Hello,
I am running into an issue when merging data from a DataFrame into a Delta table. The merge fails with:
Job aborted due to stage failure: Task 17 in stage 1770.0 failed 4 times, most recent failure: Lost task 17.3 in stage 1770.0 (TID 1669) (1x.xx.xx.xx executor 8): org.apache.spark.memory.SparkOutOfMemoryError: [UNABLE_TO_ACQUIRE_MEMORY] Unable to acquire 28 bytes of memory, got 0.
Here is the script:
df.createOrReplaceTempView("df_re")
%sql
MERGE INTO catalog.schema.table target USING df_re source
ON target.DB_ID = source.DB_ID
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *
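For reference, the same merge expressed through the Python DeltaTable API looks roughly like this (a sketch using the same table and column names; I have not confirmed whether it behaves any differently from the SQL cell):

from delta.tables import DeltaTable

# Same merge as the SQL cell above, via the Python API
target = DeltaTable.forName(spark, "catalog.schema.table")
(
    target.alias("target")
    .merge(df.alias("source"), "target.DB_ID = source.DB_ID")
    .whenMatchedUpdateAll()      # UPDATE SET *
    .whenNotMatchedInsertAll()   # INSERT *
    .execute()
)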
The amount of data is small, around 200k rows or even fewer.
"node_type_id": "Standard_D16as_v5"
"spark_version": "14.3.x-scala2.12"
The cluster has no custom Spark configurations. Unity Catalog is in use and the Delta tables are stored in an external location.
One thing to note: the notebook where this merge runs contains a lot of DataFrames and other data transformations that build the DataFrame that is then registered as the temp view, so the lineage behind df_re may be quite long.
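In case that matters, one thing I may try is materializing the DataFrame before the merge to cut the lineage (a sketch; the staging table name is a placeholder, and I have not verified that this avoids the error):

# Option A: truncate the lineage in memory
df = df.localCheckpoint()

# Option B: persist to a staging Delta table and read it back
# ("catalog.schema.df_re_staging" is just a placeholder name)
df.write.mode("overwrite").saveAsTable("catalog.schema.df_re_staging")
df = spark.table("catalog.schema.df_re_staging")

df.createOrReplaceTempView("df_re")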
It is a mystery to me and I have no idea how to solve it; I am quite sure it is not a data issue.
Any tips or help are welcome.