Does it still make sense to run this job on a Photon-enabled cluster when I am getting the following?
This is the code I ran:
CREATE OR REPLACE TABLE ${tbl_name}_dups
SELECT src.*,
       ROW_NUMBER() OVER (
         PARTITION BY src.id
         ORDER BY src.id
       ) row_num
FROM ${tbl_name} src
INNER JOIN (
  select id, count(*) n
  from ${tbl_name}
  group by id
  having n > 1
) dups
ON src.id = dups.id;
This is what I see in the SQL/DataFrame section of the cluster UI:
== Photon Explanation ==
Photon does not fully support the query because:
Unsupported expression(s): dynamicpruning#601 600
reference node:
FileScan parquet spark_catalog.default.opensea_events[approved_account#466,asset#467,asset_bundle#468,auction_type#469,bid_amount#470,collection_slug#471,contract_address#472,created_date#473,custom_event_name#474,dev_fee_payment_event#475,dev_seller_fee_basis_points#476L,duration#477,ending_price#478,event_timestamp#479,from_account#481,id#482L,is_private#483,listing_time#484,owner_account#485,payment_token#486,quantity#487,seller#488,starting_price#489,to_account#490,... 5 more fields] Batched: true, DataFilters: [isnotnull(id#482L), dynamicpruning#601 600], Format: Parquet, Location: PreparedDeltaFileIndex(1 paths)[dbfs:/mnt/opensea-tbl/asset_events_2], PartitionFilters: [], PushedFilters: [IsNotNull(id)], ReadSchema: struct<approved_account:string,asset:struct<animation_original_url:string,animation_url:string,as...