Spark Out of Memory Error

leungi
Contributor

Background

I'm using the R {sparklyr} package to fetch data from tables in Unity Catalog and ran into the error below.
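
For context, the fetch is roughly along these lines (a minimal sketch, not the exact code; the catalog, schema, and table names are placeholders):

library(sparklyr)
library(dplyr)

# Connect to the attached Databricks cluster from the notebook session
sc <- spark_connect(method = "databricks")

# Reference the Unity Catalog table by its three-level name (placeholder names)
uc_tbl <- tbl(sc, dbplyr::in_catalog("my_catalog", "my_schema", "my_table"))

# Pulling the full table into the R session is where the error occurs
local_df <- collect(uc_tbl)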

I tried the following, to no avail:

  • Using a memory-optimized cluster, e.g., E4d.
  • Using a bigger (more RAM) cluster, e.g., E8d.
  • Enabling auto-scaling.
  • Setting the Spark config below (applied as in the sketch after this list):
    • spark.driver.maxResultSize 4096
    • spark.memory.offHeap.enabled true
    • spark.driver.memory 8082
    • spark.executor.instances 4
    • spark.memory.offHeap.size 7284
    • spark.executor.memory 7284
    • spark.executor.cores 4
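
For completeness, one way to pass these settings from R at connection time is via sparklyr's spark_config() (a sketch only; on Databricks, driver and executor sizing generally has to go in the cluster's Spark config, since the cluster is already running when the notebook attaches):

library(sparklyr)

# Mirror the values listed above (copied as posted; no units were specified)
conf <- spark_config()
conf$spark.driver.maxResultSize   <- 4096
conf$spark.memory.offHeap.enabled <- "true"
conf$spark.driver.memory          <- 8082
conf$spark.executor.instances     <- 4
conf$spark.memory.offHeap.size    <- 7284
conf$spark.executor.memory        <- 7284
conf$spark.executor.cores         <- 4

sc <- spark_connect(method = "databricks", config = conf)

# Check which values actually took effect on the running cluster
spark_context_config(sc)[c("spark.driver.maxResultSize", "spark.executor.memory")]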

Error

Error: ! org.apache.spark.memory.SparkOutOfMemoryError: Total memory usage during row decode exceeds spark.driver.maxResultSize (4.0 GiB). The average row size was 48.0 B, with 2.9 GiB used for temporary buffers.
Run `sparklyr::spark_last_error()` to see the full Spark error (multiple lines).
To use the previous style of error message set `options("sparklyr.simple.errors" = TRUE)`.
1 REPLY

@Retired_mod , thanks for the detailed suggestions.

I believe the first reference relates to the issue; however, after adjusting spark.driver.maxResultSize to various values (e.g., 10g, 20g, 30g), a new error ensues (see below).

The operation involves a collect() on a Delta table with 380 million rows and 5 columns (3.2 GB, partitioned into 55 files). If the average row size is 48 bytes (per the initial error), shouldn't 20 GB be sufficient?
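
Back-of-the-envelope, using the row count and the average row size reported by the first error:

rows          <- 380e6   # ~380 million rows in the Delta table
avg_row_bytes <- 48      # average row size from the first error message
rows * avg_row_bytes / 1024^3
# ~17 GiB estimated for the collected result, which is below the 20g tried
# for spark.driver.maxResultSize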

New Error

The spark driver has stopped unexpectedly and is restarting. Your notebook will be automatically reattached.
at com.databricks.spark.chauffeur.Chauffeur.onDriverStateChange(Chauffeur.scala:1367)

 
