Partition filter is skipped when table is used in where condition, why?
Options
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
01-12-2023 01:24 AM
Hi,
maybe someone can help me i do want to run very narrow query
SELECT *
FROM my_table
WHERE snapshot_date IN('2023-01-06', '2023-01-07')
-- part of the physical plan:
-- Location: PreparedDeltaFileIndex [dbfs:/...]
-- PartitionFilters: [cast(snapshot_date#282634 as string) IN (2023-01-06,2033-01-07)]it is very fast ~1s, in the execution plan i see it is using those provided datasets as arguments for partition filters.
If I try to provide filter predicate in form of the one column table it does full table scan and it takes 100x longer.
%sql
SELECT *
FROM
my_table
WHERE snapshot_date IN (
SELECT snapshot_date
FROM (VLUES('2023-01-06'), ('2023-01-07')) T(snapshot_date)
)
-- plan
Location: PreparedDeltaFileIndex [dbfs:/...]
ReadSchema: ...Is there a way to provide hints for parition filters or something to make sure it does not read whole table?
Labels:
- Labels:
-
Table