SparkOutOfMemoryError when merging data into a table that already has data

vannipart
New Contributor III

Hello, 

I am getting an out-of-memory error when merging data from a DataFrame into a table that already contains data:

2024 databricksJob aborted due to stage failure: Task 17 in stage 1770.0 failed 4 times, most recent failure: Lost task 17.3 in stage 1770.0 (TID 1669) (1x.xx.xx.xx executor 8): org.apache.spark.memory.SparkOutOfMemoryError: [UNABLE_TO_ACQUIRE_MEMORY] Unable to acquire 28 bytes of memory, got 0.

 

The script:

df.createOrReplaceTempView("df_re")

 

%sql
MERGE INTO catalog.schema.table target USING df_re source
ON target.DB_ID = source.DB_ID
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *

The data volume is small: around 200k rows, sometimes fewer.

"node_type_id": "Standard_D16as_v5"

"spark_version": "14.3.x-scala2.12"

The cluster has no custom Spark configuration.

Unity Catalog is in use, and the Delta tables are in an external location.

One thing to note: the notebook in which this merge runs contains many DataFrames and other transformations that build this DataFrame, which is then registered as a temp view.
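Since the temp view carries the whole chain of transformations with it, one thing that might be worth trying is materializing the DataFrame before the merge, so that the MERGE reads from a plain table scan. This is only a minimal sketch and is untested against this error. The helper names (build_merge_sql, materialize_then_merge) and the staging table name are hypothetical. The `spark` and `df` arguments are assumed to be the notebook's existing session and DataFrame.

```python
def build_merge_sql(target: str, source_view: str, key: str) -> str:
    """Build the same MERGE statement as in the post for the given names."""
    return (
        f"MERGE INTO {target} target USING {source_view} source\n"
        f"ON target.{key} = source.{key}\n"
        "WHEN MATCHED THEN UPDATE SET *\n"
        "WHEN NOT MATCHED THEN INSERT *"
    )


def materialize_then_merge(spark, df, target, key, staging_table, source_view="df_re"):
    """Write df to a staging table, then merge from that table instead of from df."""
    # Running the write executes the long chain of transformations once, here.
    df.write.mode("overwrite").saveAsTable(staging_table)
    # The re-read DataFrame's plan is just a scan of the staging table.
    spark.table(staging_table).createOrReplaceTempView(source_view)
    return spark.sql(build_merge_sql(target, source_view, key))


# Hypothetical usage in the notebook (the staging table name is an assumption):
# materialize_then_merge(
#     spark, df,
#     target="catalog.schema.table",
#     key="DB_ID",
#     staging_table="catalog.schema.df_re_staging",
# )
```

If the error goes away with this, the memory pressure is more likely coming from the accumulated query plan than from the merge itself.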

This is a mystery to me and I have no idea how to solve it. I am sure it is not a data issue.

Any tips or help are welcome.