Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

sql delete?

dan11
New Contributor II

<pre>
Hello Databricks people,

I started working with Databricks today. I have a SQL script which I developed with sqlite3 on a laptop, and I want to port the script to Databricks.

I started with two SQL statements:

select count(prop_id) from prop0;
delete from prop0 where prop_id is null;

They seem like simple statements. When I run them on Databricks I see this:

Unsupported language features in query: delete from prop0 where prop_id is null

I find it hard to believe that Databricks does not support this statement:

delete from prop0 where prop_id is null

Am I doing something wrong? Is it reasonable to expect that Databricks should support it?
</pre>

4 REPLIES

dan11
New Contributor II

The js-editor in this forum is horrible; I tried using pre-tags because of the angle-brackets I see in the upper menu and they got sanitized.

vida
Databricks Employee

Hi Dan,

Spark SQL is based on HiveQL. It lets you use SQL syntax to query large datasets, for example to count rows. It does not, however, support operations like DELETE and UPDATE. I cover why in my talk here:

https://spark-summit.org/east-2016/events/not-your-fathers-database-how-to-use-apache-spark-properly...

-Vida

BaranitharanV
New Contributor II

Vida, there is a page in the Databricks documentation mentioning that deletes are permitted on Delta tables.

https://docs.databricks.com/spark/latest/spark-sql/language-manual/delete.html

Are we missing anything here?

Bill_Chambers
Contributor II

Hey Dan, good to hear you're getting started with Databricks. This is not a limitation of Databricks; it's a restriction built into Spark itself. Spark is not a data store, it's a distributed computation framework, so deleting data in place is unnecessary. If you don't need some rows, you just filter them out, either in a query or by writing the rows you want to keep to a new table as below.

%sql CREATE TABLE new_table AS SELECT * FROM prop0 WHERE prop_id IS NOT NULL
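Since the original script was developed with sqlite3, one way to convince yourself that filtering into a new table keeps exactly the rows the DELETE would have left behind is to run both locally. This is only a minimal sketch: the sample rows, the `price` column, and the `prop0_clean` table name are made up for illustration, and it runs against an in-memory sqlite3 database, not Spark.

```python
import sqlite3

# Toy stand-in for prop0; the rows and the "price" column are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prop0 (prop_id INTEGER, price REAL)")
conn.executemany(
    "INSERT INTO prop0 VALUES (?, ?)",
    [(1, 100.0), (None, 250.0), (2, 175.5), (None, 90.0)],
)

# Filter approach: write only the rows with a non-null prop_id to a new table.
conn.execute(
    "CREATE TABLE prop0_clean AS SELECT * FROM prop0 WHERE prop_id IS NOT NULL"
)
kept_by_filter = conn.execute(
    "SELECT prop_id, price FROM prop0_clean ORDER BY prop_id"
).fetchall()

# Delete approach, as in the original sqlite3 script.
conn.execute("DELETE FROM prop0 WHERE prop_id IS NULL")
kept_by_delete = conn.execute(
    "SELECT prop_id, price FROM prop0 ORDER BY prop_id"
).fetchall()

# Both approaches should leave the same rows.
print(kept_by_filter == kept_by_delete)
```

If the two result sets match on your real data, later queries in the ported script can read from the filtered table instead of relying on the delete.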

It's probably worth your time reading a bit more about the tools that Spark provides. The learning curve is steep, but once you get past the first steps you'll start seeing the value! 🙂 I'd recommend some of the material that we have in the Community Edition, like some of the CS100 coursework.
