In our project, we are testing liquid clustering with a test table called `status_update`, in which we need to update the status for different market IDs. We attempt to update the `status_update` table in parallel using the UPDATE command.
```sql
ALTER TABLE status_update CLUSTER BY (mkt_id);
```

```python
spark.sql(f"UPDATE status_update SET status='{status}' WHERE mkt_id={mkt_id}")
```
However, when we run the notebook in parallel for different market IDs, the concurrent UPDATE statements conflict with each other and we encounter a concurrency error.