
What is the difference between Databricks SQL vs Databricks cluster with Photon runtime?

jwilliam
Contributor

Accepted Solutions

BilalAslamDbrx
Honored Contributor II

Great question! There are similarities and differences:

Similarities

  • Photon is enabled on both
  • You have Databricks Runtime on both

Differences

  • The Databricks Runtime (DBR) version is managed and auto-upgraded in Databricks SQL. Because SQL is a narrower workload than, say, data science, we automatically manage the version of DBR that runs on Databricks SQL Endpoints. This is a good thing: you don't have to worry about upgrades.
  • DBR behaves slightly differently on SQL Endpoints compared to Clusters. This is again a good thing. Mostly, we optimize for the SQL workload and set configs automatically so you don't have to.
  • SQL Endpoints actually sit behind a scalable gateway proxy. This proxy can, among other things, add or remove clusters as your SQL workload scales up or down. This brings elasticity to your workloads. Things like caching and metadata processing also happen here to speed things up.
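
To make the elasticity point concrete, here is a toy sketch of a "scale out with load, within limits" rule. It is not how the gateway proxy actually decides anything: the function name `clusters_needed`, the figure of 10 queries per cluster, and the min/max bounds are all assumptions picked for illustration.

```python
import math


def clusters_needed(concurrent_queries, queries_per_cluster=10,
                    min_clusters=1, max_clusters=4):
    """Toy scaling rule (hypothetical, not the real proxy logic).

    Returns how many clusters would serve the given number of
    concurrent queries, clamped to [min_clusters, max_clusters].
    """
    if queries_per_cluster <= 0:
        raise ValueError("queries_per_cluster must be positive")
    if min_clusters > max_clusters:
        raise ValueError("min_clusters must not exceed max_clusters")
    # Enough clusters to hold every query, rounded up.
    wanted = math.ceil(concurrent_queries / queries_per_cluster)
    # Never go below the floor or above the ceiling.
    return max(min_clusters, min(max_clusters, wanted))


if __name__ == "__main__":
    for load in (0, 8, 25, 100):
        print(load, "queries ->", clusters_needed(load), "cluster(s)")
```

The point of the clamp is that the workload drives the cluster count between a floor and a ceiling you choose, so you don't resize anything by hand.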

TL;DR: if you are doing SQL/BI, please consider using SQL Endpoints. They are generally the best choice for that workload.


6 REPLIES

Anonymous
Not applicable

They are very similar. Databricks SQL uses compute that has Photon enabled. A traditional cluster with Photon enabled allows a few more configurations to be set around the cluster architecture and settings. The traditional cluster also has more libraries installed because it needs to run code in various languages, whereas the endpoints only need the SQL APIs.

https://docs.databricks.com/runtime/photon.html#limitations lists some limitations, although reading from additional data sources is in preview now.
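
As a rough illustration of the "few more configurations" point, here is a sketch comparing the kind of settings you would pass when creating each type of compute. The field names are what I recall from the Clusters and SQL warehouses REST APIs, so treat them as assumptions and check the current API reference. Every value (names, runtime version, node type, sizes) is a made-up placeholder.

```python
import json

# Traditional cluster: you choose the runtime version, node type, and
# Spark configs yourself. All values are hypothetical placeholders.
cluster_spec = {
    "cluster_name": "photon-sql-cluster",
    "spark_version": "13.3.x-scala2.12",  # DBR version is your choice
    "node_type_id": "i3.xlarge",
    "num_workers": 2,
    "runtime_engine": "PHOTON",           # opt in to Photon
    "spark_conf": {"spark.sql.shuffle.partitions": "200"},
}

# SQL endpoint: a T-shirt size and scaling bounds; no runtime version
# or Spark configs to pick. Values are again placeholders.
warehouse_spec = {
    "name": "bi-warehouse",
    "cluster_size": "Small",
    "min_num_clusters": 1,
    "max_num_clusters": 3,
    "auto_stop_mins": 30,
    "enable_photon": True,
}

# Settings that appear only on the traditional-cluster side.
only_on_cluster = sorted(set(cluster_spec) - set(warehouse_spec))

if __name__ == "__main__":
    print(json.dumps(only_on_cluster, indent=2))
```

Nothing here calls Databricks; it only builds the two payloads so the difference in what you manage is visible side by side.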

jwilliam
Contributor

Thank you. Will traditional clusters support serverless execution in the future, or will only SQL endpoints support that?

And are there any optimization tweaks in Databricks SQL that might make it faster than a traditional Databricks cluster running only SQL queries?

Anonymous
Not applicable

Serverless for traditional compute is in preview for single-node machines, and serverless for multi-node clusters is on the roadmap.

I'm sure there are a few optimizations that make things faster. Simple things such as caching metadata in the metastore help.

Hubert-Dudek
Esteemed Contributor III

I wouldn't call them the same, as the Databricks SQL runtime is a bit different (not everything is supported, for example UDFs), and new releases are separate from standard runtime updates: https://docs.databricks.com/sql/release-notes/index.html

A Databricks cluster can handle notebooks. A SQL endpoint is only for SQL queries.

Both can be in Photon or non-Photon versions. Photon has a bunch of improvements; for example, it handles the small files problem better.

Kaniz
Community Manager

Hi @John William, just a friendly follow-up. Do you still need help, or did the above responses help you find the solution? Please let us know.
