cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

IDENTIFIER not working in UPDATE

marcuskw
Contributor

The following code works perfectly fine:

 

df = spark.createDataFrame([('A', 1), ('B', 2)])
df.createOrReplaceTempView('temp')

spark.sql("""
    SELECT
        IDENTIFIER(:col)
    FROM
        temp
""",
    args={
        "col": "_1"
    }
).display()

 

However if we instead want to do an UPDATE the same logic no longer functions:

spark.sql("""
    UPDATE
        temp
    SET
        IDENTIFIER(:col) = "C"
""",
    args={
        "col": "_1"
    }
)

Here we get the following error:
[PARSE_SYNTAX_ERROR] Syntax error at or near '(': missing EQ. SQLSTATE: 42601

Updating a temp table is not allowed, but used this quickly for a debug example.
When running the following UPDATE we get the correct error message "UPDATE TABLE is not supported temporarily.":

 

spark.sql("""
    UPDATE
        temp
    SET
        _1 = "C"
""")

 

 

1 ACCEPTED SOLUTION

Accepted Solutions

marcuskw
Contributor
6 REPLIES 6

Kaniz_Fatma
Community Manager
Community Manager

Hi @marcuskwIn Spark SQL, updating temporary tables is not directly supported. A few alternatives are:- 

  • Create a New DataFrame: Change your data in a new DataFrame and overwrite the temp table.
  • Use Delta Lake: Delta Lake provides ACID transactions and robust data management for complex scenarios.
  • Persist Data: Write data to a permanent or Delta Lake table to keep it across sessions.

Let me know if you need further assistance or have any other questions!

 

 

Hi,

I wrote it in the bottom of my post that I know that updating a Temp table isn't allowed.
I used it as a quick example to illustrate the error message, but you get the same error message when updating a Delta Lake table.

The problem is the IDENTIFIER() syntax doesn't seem to function properly in an UPDATE statement. But works perfectly fine in a SELECT.

"Updating a temp table is not allowed, but used this quickly for a debug example.
When running the following UPDATE we get the correct error message "UPDATE TABLE is not supported temporarily.""

Witold
Contributor III

Hey @marcuskw the docs state that this is actually not supported. You can only use table names in an update statement:


  • Table name of a MERGE, UPDATE, DELETE, INSERT, COPY INTO

 

Doesn't the docs say it is supported? (DML)

"A column or view referenced in a query. This includes queries embedded in a DDL or DML statement.
When using the identifier clause it may not be embedded within an identifier."

Here I'm referencing a column in an UPDATE statement.

spark.sql("""
    UPDATE
        temp
    SET
        IDENTIFIER(:col) = "C"
""",
    args={
        "col": "_1"
    }
)

 

Witold
Contributor III

Hey @marcuskw!

Sorry for the late response. IMHO the docs are not clear on that, but IDENTIFIER's are not supported in `SET` clauses. It works in `WHERE` clauses though (which of course will not help in your case):

spark.sql("""
    UPDATE
        temp
    SET
        _1 = "C"
    WHERE
        IDENTIFIER(:col) <> "C"
""",
    args={
        "col": "_1"
    }
)

 @Kaniz_Fatma You should update the docs accordingly.

marcuskw
Contributor

If it helps anyone else I found this article that described a few limitations:
https://community.databricks.com/t5/technical-blog/how-not-to-build-an-execute-immediate-demo/ba-p/8...

marcuskw_0-1723790886149.png

 

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group