06-16-2021 08:25 AM - last edited on 03-21-2025 05:57 AM by Advika
I see the option to enable Photon when creating a new SQL Endpoint. The description says that enabling it helps speed up queries, which sounds good, but are there any downsides I need to be aware of?
- Labels: SQL
Accepted Solutions
06-18-2021 02:13 PM
Generally, yes, you should enable Photon. Most functionality is supported and performs extremely well. There are some limitations, listed below.
Limitations:
- Works on Delta and Parquet tables only for both read and write.
- Does not support the following data types: Map, Array.
- Does not support window and sort operators.
- Does not support Spark Structured Streaming.
- Does not support UDFs.
- Not expected to improve operations bottlenecked by network or scan I/O.
- Not expected to improve short-running queries (<2 seconds), for example, against small data.
Advantages:
- Supports SQL and equivalent DataFrame operations against Delta and Parquet tables.
- Expected to accelerate queries that process a significant amount of data (100GB+) and include aggregations and joins.
- Works well when data is accessed repeatedly and is likely to be in the Delta Lake cache.
- More robust scan performance on tables with many columns and many small files.
- Faster Delta and Parquet writing using update, delete, merge into, and create table as select, especially for wide tables (hundreds to thousands of columns).
- Replaces sort-merge joins with hash joins.
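
If it helps to apply the two lists above to a specific workload, here is a minimal sketch of them as a checklist in plain Python. Everything in it is hypothetical: `Workload`, `photon_concerns`, and `likely_to_benefit` are names made up for illustration and are not part of any Databricks API. The rules only encode the points in this answer as written (June 2021), so they may not match what current Photon releases support.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Workload:
    """Hypothetical description of a query workload (not a Databricks API)."""
    table_format: str                      # e.g. "delta", "parquet", "csv"
    column_types: Set[str] = field(default_factory=set)  # e.g. {"int", "map"}
    uses_window_or_sort: bool = False
    uses_streaming: bool = False
    uses_udfs: bool = False
    io_bound: bool = False                 # bottlenecked by network or scan I/O
    typical_runtime_s: float = 10.0
    data_scanned_gb: float = 0.0
    has_joins_or_aggregations: bool = False


def photon_concerns(w: Workload) -> List[str]:
    """Return the limitations from the list above that apply to this workload."""
    concerns = []
    if w.table_format.lower() not in {"delta", "parquet"}:
        concerns.append("tables are not Delta or Parquet")
    unsupported = sorted({t.lower() for t in w.column_types} & {"map", "array"})
    if unsupported:
        concerns.append("unsupported data types: " + ", ".join(unsupported))
    if w.uses_window_or_sort:
        concerns.append("uses window or sort operators")
    if w.uses_streaming:
        concerns.append("uses Spark Structured Streaming")
    if w.uses_udfs:
        concerns.append("uses UDFs")
    if w.io_bound:
        concerns.append("bottlenecked by network or scan I/O")
    if w.typical_runtime_s < 2:
        concerns.append("short-running queries (<2 seconds)")
    return concerns


def likely_to_benefit(w: Workload) -> bool:
    """True if no limitation applies and the workload is large (100GB+)
    with aggregations or joins, per the advantages list above."""
    return (
        not photon_concerns(w)
        and w.data_scanned_gb >= 100
        and w.has_joins_or_aggregations
    )
```

For example, `photon_concerns(Workload(table_format="csv", uses_udfs=True))` would flag the table format and the UDFs. An empty result only means none of the listed limitations apply, so it is still worth benchmarking with Photon on and off.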

