Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

How to update value of a column with MAP data-type in a delta table using a python dictionary and SQL UPDATE command?

Mado
Valued Contributor II

I have a delta table created by:

%sql

CREATE TABLE IF NOT EXISTS dev.bronze.test_map (
  id INT,
  table_updates MAP<STRING, TIMESTAMP>,

  CONSTRAINT test_map_pk PRIMARY KEY (id)
) USING DELTA
LOCATION "abfss://bronze@Table Path"

With initial values:

INSERT INTO dev.bronze.test_map 
VALUES (1, null), 
       (2, null), 
       (3, null);

Note that there is no value in column "table_updates".


After processing other tables in our platform, I have the table-update info as a Python dictionary like the one below:

table_updates_id1 = {
    'id1_table_1': datetime.datetime(2023, 3, 26, 4, 33, 22, 323000),
    'id1_table_2': datetime.datetime(2023, 3, 26, 4, 33, 22, 323000),
}


Now, I want to update the value of column "table_updates" where id = 1 using the SQL UPDATE command (note that I want to update the table itself, not a dataframe).

I tried two different methods, but both failed.

1st trial:

spark.sql(f"""
UPDATE dev.bronze.test_map
SET
    table_updates = map({table_updates_id1})
WHERE
    id = 1
""")

Error:

(error screenshot)

2nd trial:

spark.sql(f"""
UPDATE dev.bronze.test_map
SET
    table_updates = map('{','.join([f'{k},{v}' for k,v in table_updates_id1.items()])}')
WHERE
    id = 1
""")

Error:

(error screenshot)

Any idea how to solve this issue? Thanks.
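For reference, the interpolation in the 2nd trial collapses everything into one quoted string, so Spark sees map() called with a single STRING argument instead of alternating key/value arguments. A small sketch in plain Python, just reproducing the rendered SQL fragment, makes this visible:

```python
import datetime

# The dictionary from the question.
table_updates_id1 = {
    'id1_table_1': datetime.datetime(2023, 3, 26, 4, 33, 22, 323000),
    'id1_table_2': datetime.datetime(2023, 3, 26, 4, 33, 22, 323000),
}

# Reproduce the expression the 2nd trial interpolates into the UPDATE.
fragment = "map('{}')".format(','.join(f'{k},{v}' for k, v in table_updates_id1.items()))
print(fragment)
# map('id1_table_1,2023-03-26 04:33:22.323000,id1_table_2,2023-03-26 04:33:22.323000')
```

The single pair of quotes wraps the entire joined string, which is why the statement cannot parse as a valid map expression.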

3 REPLIES

Anonymous
Not applicable

@Mohammad Saber:

The error messages you received indicate that the SQL UPDATE command is expecting a column of type map<string,timestamp>, but is instead receiving a string or a non-matching datatype. To resolve this issue, you can convert the Python dictionary to a valid SQL map format using the map_from_entries function in Spark SQL.

Here's an example of how you can use the map_from_entries function to update the table_updates column in your delta table:

from pyspark.sql.functions import map_from_entries
 
# Convert Python dictionary to list of key-value pairs
table_updates_list = list(table_updates_id1.items())
 
# Convert list of key-value pairs to SQL map format
table_updates_map = map_from_entries(table_updates_list)
 
# Use SQL UPDATE command to update delta table
spark.sql(f"""
    UPDATE dev.bronze.test_map
    SET
        table_updates = {table_updates_map}
    WHERE
        id = 1
""")

In this example, the table_updates_id1 dictionary is first converted to a list of key-value pairs, which is then passed to the map_from_entries function to create a SQL map. The resulting map is then used in the SQL UPDATE command to update the table_updates column for the row where id = 1.

Mado
Valued Contributor II

Thanks for your help.

I get the following error when running the following code:

table_updates_map = map_from_entries(table_updates_list)

(error screenshot)
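One workaround is to render the dictionary into a literal map(key, value, ...) expression with TIMESTAMP literals and interpolate that into the UPDATE. This is a sketch, not from the thread: the helper name map_literal and the millisecond-precision timestamp formatting are assumptions.

```python
import datetime

# The dictionary from the question.
table_updates_id1 = {
    'id1_table_1': datetime.datetime(2023, 3, 26, 4, 33, 22, 323000),
    'id1_table_2': datetime.datetime(2023, 3, 26, 4, 33, 22, 323000),
}

def map_literal(d):
    """Render a dict[str, datetime] as a Spark SQL map(...) expression
    with alternating key/value arguments and TIMESTAMP literals."""
    pairs = ", ".join(
        f"'{k}', TIMESTAMP '{v.strftime('%Y-%m-%d %H:%M:%S.%f')[:-3]}'"
        for k, v in d.items()
    )
    return f"map({pairs})"

update_sql = f"""
UPDATE dev.bronze.test_map
SET table_updates = {map_literal(table_updates_id1)}
WHERE id = 1
"""
print(update_sql)
# The statement would then be run with spark.sql(update_sql).
```

Note that this naive quoting assumes the keys contain no single quotes; for untrusted input, building a small DataFrame and using MERGE would be more robust.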

Anonymous
Not applicable

Hi @Mohammad Saber

Thank you for your question! To assist you better, please take a moment to review the answer and let me know if it best fits your needs.

Please help us select the best solution by clicking on "Select As Best" if it does.

Your feedback will help us ensure that we are providing the best possible service to you. Thank you!
