What are Databricks SQL and Spark SQL, and how are they different from MS SQL?
03-09-2024 04:36 AM
Hello Databricks Community,
I have a hard time understanding how Databricks SQL is different from Microsoft SQL. Also, why does Databricks provide Spark SQL?
If you could direct me to a well-written webpage or document, it would be of immense help!
Thanks,
12-19-2024 07:56 AM - edited 12-19-2024 07:57 AM
Databricks SQL is a product name (Databricks' data warehousing offering).
https://www.databricks.com/product/databricks-sql
Spark SQL is the SQL interface (SQL commands) that runs on Spark as its computing engine.
https://www.databricks.com/glossary/what-is-spark-sql
MSSQL (assuming you're talking about Microsoft SQL Server) is a relational database management system developed by Microsoft. The language it supports is also SQL, but with proprietary functions and extensions that only work there, similar to how Oracle has PL/SQL.
All of these "SQL" variants try to align with the ANSI standard, but each engine can have its own extensions, functions, and so on. Some of your code might be transferable from system to system, but a lot of it won't be.
This is why transpilers such as sqlglot (https://github.com/tobymao/sqlglot) exist: to help with those conversions.
12-19-2024 11:37 AM
I assume by "microsoft sql" you mean Microsoft's Transact-SQL (T-SQL) language, not the Microsoft SQL Server product?
Databricks SQL (DB-SQL) and T-SQL are both based on ANSI SQL with some platform-specific language extensions. This means most of the basics of SQL are the same: SELECT, WHERE, JOIN, etc. However, there are syntax differences. For example, in T-SQL we would write "select top 100 * from ..." where in DB-SQL we'd write "select * from ... limit 100". A comprehensive list of differences is well beyond the scope of a forum post, but the DB-SQL language reference is available on Microsoft Learn as "SQL language reference - Azure Databricks - Databricks SQL". The languages are similar enough, and Databricks Assistant is really good, so just write a DB-SQL query like you would a T-SQL query. It might work, and if not, the Assistant will help you fix it.
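To make the TOP versus LIMIT difference concrete, here is a minimal Python sketch. The helper name `top_to_limit` and the table name `sales.orders` are made up for illustration, and the function only handles the simplest "SELECT TOP n ..." form (no parentheses, no PERCENT, no WITH TIES). It is a toy, not a real converter; for actual migrations a proper transpiler such as sqlglot, mentioned earlier in the thread, is the better choice.

```python
import re

# Hypothetical helper, for illustration only: matches the simplest
# T-SQL form "SELECT TOP n <rest>" with an optional trailing semicolon.
_TOP_PATTERN = re.compile(
    r"^\s*select\s+top\s+(\d+)\s+(.*?);?\s*$",
    re.IGNORECASE | re.DOTALL,
)


def top_to_limit(tsql_query: str) -> str:
    """Rewrite 'SELECT TOP n ...' as 'SELECT ... LIMIT n'.

    Queries without a leading TOP clause are returned unchanged.
    """
    match = _TOP_PATTERN.match(tsql_query)
    if match is None:
        return tsql_query
    row_count, rest = match.groups()
    return f"SELECT {rest} LIMIT {row_count}"


# sales.orders is an assumed table name, not from the original post.
print(top_to_limit("select top 100 * from sales.orders"))
# prints: SELECT * from sales.orders LIMIT 100
```

Everything else in the query (column list, FROM, WHERE, JOIN) passes through untouched, which matches the point above that the basics are shared and only some clauses differ.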
In addition to querying data, DB-SQL is also used to manage the platform and perform administrative functions, and all of those commands are specific to Databricks.
Databricks provides Spark SQL because Databricks compute is built on Spark.
01-03-2025 01:48 AM
Databricks SQL and Spark SQL are built for distributed big data analytics. Databricks SQL works well with business intelligence tools and uses Delta Lake for efficient data storage. Spark SQL integrates with Spark's programming features for data processing. Unlike MS SQL, which is designed for centralized databases and structured data, these tools handle huge datasets, including semi-structured and unstructured data, in a distributed way.

