<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Filtering delta table by CONCAT of a partition column and a non-partition one in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/filtering-delta-table-by-concat-of-a-partition-column-and-a-non/m-p/3829#M738</link>
    <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I know that filtering a delta table on a partition column is a very powerful time-saving approach, but what if this column appears inside a CONCAT in the where-clause?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I explain my case: I have a delta table with only one partition column, say called col1. I need to query this table through an API request by using a serverless SQL warehouse in Databricks SQL, and for my purpose it is simpler to implement a filter as a CONCAT of col1 together with another column. &lt;/P&gt;&lt;P&gt;Is Spark smart enough to understand that this table is partitioned on one of the two columns, or do I lose the partition info?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;</description>
    <pubDate>Wed, 31 May 2023 08:37:33 GMT</pubDate>
    <dc:creator>darioAnt</dc:creator>
    <dc:date>2023-05-31T08:37:33Z</dc:date>
    <item>
      <title>Filtering delta table by CONCAT of a partition column and a non-partition one</title>
      <link>https://community.databricks.com/t5/data-engineering/filtering-delta-table-by-concat-of-a-partition-column-and-a-non/m-p/3829#M738</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I know that filtering a delta table on a partition column is a very powerful time-saving approach, but what if this column appears inside a CONCAT in the where-clause?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I explain my case: I have a delta table with only one partition column, say called col1. I need to query this table through an API request by using a serverless SQL warehouse in Databricks SQL, and for my purpose it is simpler to implement a filter as a CONCAT of col1 together with another column. &lt;/P&gt;&lt;P&gt;Is Spark smart enough to understand that this table is partitioned on one of the two columns, or do I lose the partition info?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;</description>
      <pubDate>Wed, 31 May 2023 08:37:33 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/filtering-delta-table-by-concat-of-a-partition-column-and-a-non/m-p/3829#M738</guid>
      <dc:creator>darioAnt</dc:creator>
      <dc:date>2023-05-31T08:37:33Z</dc:date>
    </item>
    <item>
      <title>Re: Filtering delta table by CONCAT of a partition column and a non-partition one</title>
      <link>https://community.databricks.com/t5/data-engineering/filtering-delta-table-by-concat-of-a-partition-column-and-a-non/m-p/3830#M739</link>
      <description>&lt;P&gt;I ran a test myself and the answer is no:&lt;/P&gt;&lt;P&gt;with a CONCAT filter, Spark SQL does not know that I am using a partition-based column, so it scans the entire table. &lt;span class="lia-unicode-emoji" title=":disappointed_face:"&gt;😞&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 31 May 2023 13:21:20 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/filtering-delta-table-by-concat-of-a-partition-column-and-a-non/m-p/3830#M739</guid>
      <dc:creator>darioAnt</dc:creator>
      <dc:date>2023-05-31T13:21:20Z</dc:date>
    </item>
  </channel>
</rss>

