Warehousing & Analytics

What is the difference between Databricks SQL vs Databricks cluster with Photon runtime?

jwilliam
Contributor
Accepted Solutions

BilalAslamDbrx
Databricks Employee

Great question! There are similarities and differences:

Similarities

  • Photon is enabled on both
  • You have Databricks Runtime on both

Differences

  • Databricks Runtime (DBR) version is managed and auto-upgraded in Databricks SQL. Because SQL is a narrower workload than, say, data science, we automatically manage the version of DBR that runs on Databricks SQL Endpoints. This is a good thing - you don't have to worry about upgrading etc.
  • DBR behaves slightly differently on SQL Endpoints compared to Clusters. This is again a good thing. Mostly we optimize for the SQL workload and set configs automatically so you don't have to.
  • SQL Endpoints sit behind a scalable gateway proxy. Among other things, this proxy adds or removes clusters as your SQL workload grows or shrinks, which brings elasticity to your workloads. Caching and metadata processing also happen at this layer to speed things up.

TL;DR: if you are doing SQL/BI, please consider using SQL Endpoints; they are generally the best choice for that workload.
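
To make the differences above a bit more concrete, here is a rough sketch (not an official example) of what the two kinds of compute look like as request payloads. It only builds dictionaries and does not call any API. The field names follow the Clusters and SQL Warehouses REST APIs as I understand them, so please check them against the current docs. The function names, the runtime version string, and the node type are hypothetical placeholders.

```python
import json


def cluster_spec(name, spark_version, node_type_id, num_workers):
    """Payload shaped like a cluster create request with Photon enabled.

    Assumption: field names mirror the Clusters REST API. Note that the
    caller has to choose (and later upgrade) the DBR version explicitly.
    """
    return {
        "cluster_name": name,
        "spark_version": spark_version,  # DBR version is your responsibility
        "node_type_id": node_type_id,
        "num_workers": num_workers,
        "runtime_engine": "PHOTON",
    }


def sql_endpoint_spec(name, cluster_size, min_clusters, max_clusters):
    """Payload shaped like a SQL endpoint (warehouse) create request.

    Assumption: field names mirror the SQL Warehouses REST API. There is
    no runtime version field, because the DBR version is managed for you,
    and the min/max cluster counts bound how far the endpoint scales out.
    """
    if min_clusters > max_clusters:
        raise ValueError("min_clusters must not exceed max_clusters")
    return {
        "name": name,
        "cluster_size": cluster_size,
        "min_num_clusters": min_clusters,
        "max_num_clusters": max_clusters,
        "enable_photon": True,
    }


if __name__ == "__main__":
    # Placeholder values for illustration only.
    cluster = cluster_spec("etl-photon", "10.4.x-scala2.12", "i3.xlarge", 4)
    endpoint = sql_endpoint_spec("bi-endpoint", "Small", 1, 3)
    print(json.dumps(cluster, indent=2))
    print(json.dumps(endpoint, indent=2))
```

The thing to notice is what is missing from the endpoint payload: there is no runtime version or node type to pick, only a size and a scaling range.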


5 REPLIES

Anonymous
Not applicable

They are very similar. Databricks SQL uses compute that has Photon enabled. A traditional cluster with Photon enabled allows a few more configurations to be set around the cluster architecture and settings. The traditional cluster also has more libraries installed, because it needs to run code in various languages, whereas the endpoints only need the SQL APIs.

The Photon docs list some limitations: https://docs.databricks.com/runtime/photon.html#limitations (reads from additional data sources are in preview now).

jwilliam
Contributor

Thank you. Will traditional clusters support serverless execution in the future, or will only SQL endpoints support that?

Also, are there any optimizations in Databricks SQL that make it faster than a traditional Databricks cluster running only SQL queries?

Anonymous
Not applicable

Serverless for traditional compute is in preview for single-node machines, and serverless for multi-node clusters is on the roadmap.

I'm sure there are a few optimizations that make things faster. Simple things, such as caching metadata in the metastore, help.

Hubert-Dudek
Esteemed Contributor III

I wouldn't call them the same: the Databricks SQL runtime is a bit different (not everything is supported, for example UDFs), and its releases are separate from the standard runtime updates: https://docs.databricks.com/sql/release-notes/index.html

A Databricks cluster can run notebooks. A SQL endpoint is only for SQL queries.

Both are available in Photon and non-Photon versions. Photon has a bunch of improvements, for example better handling of the small-files problem.

