Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Spark Out of Memory Error

leungi
Contributor

Background

Using the R {sparklyr} package to fetch data from tables in Unity Catalog, I hit the error below.
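The fetch is essentially the following - a minimal sketch, with catalog/schema/table names anonymized:

    library(sparklyr)
    library(dplyr)

    # Connect from a Databricks notebook.
    sc <- spark_connect(method = "databricks")

    # Reference the Unity Catalog table and pull it into the R session;
    # collect() is what triggers the row decode on the driver.
    df <- sdf_sql(sc, "SELECT * FROM some_catalog.some_schema.some_table") %>%
      collect()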

Tried the following, to no avail:

  • Using a memory-optimized cluster - e.g., E4d.
  • Using a bigger (more RAM) cluster - e.g., E8d.
  • Enabling auto-scaling.
  • Setting Spark config (expressed via sparklyr after this list):
    • spark.driver.maxResultSize 4096
    • spark.memory.offHeap.enabled true
    • spark.driver.memory 8082
    • spark.executor.instances 4
    • spark.memory.offHeap.size 7284
    • spark.executor.memory 7284
    • spark.executor.cores 4
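For context, these settings can be expressed through sparklyr's spark_config() roughly as below - a sketch only; the unit suffixes are an assumption, since the bare values above don't specify units, and on Databricks such settings are typically applied in the cluster's Spark config UI because driver memory cannot change after the JVM starts.

    library(sparklyr)

    conf <- spark_config()
    conf$spark.driver.maxResultSize   <- "4096m"   # assuming MB
    conf$spark.memory.offHeap.enabled <- "true"
    conf$spark.memory.offHeap.size    <- "7284m"   # assuming MB
    conf$spark.driver.memory          <- "8082m"   # assuming MB
    conf$spark.executor.memory        <- "7284m"   # assuming MB
    conf$spark.executor.cores         <- 4
    conf$spark.executor.instances     <- 4

    sc <- spark_connect(method = "databricks", config = conf)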

Error

Error: ! org.apache.spark.memory.SparkOutOfMemoryError: Total memory usage during row decode exceeds spark.driver.maxResultSize (4.0 GiB). The average row size was 48.0 B, with 2.9 GiB used for temporary buffers. Run `sparklyr::spark_last_error()` to see the full Spark error (multiple lines). To use the previous style of error message set `options("sparklyr.simple.errors" = TRUE)`.

Kaniz_Fatma
Community Manager

Hi @leungi,

  1. Since the error indicates that the total memory usage during row decode exceeds spark.driver.maxResultSize, you might try increasing this value beyond 4.0 GiB.
  2. Repartition your data to increase the number of partitions. This can help distribute the data more evenly across the cluster and reduce the memory load on individual executors (a sparklyr sketch follows this list).
  3. Ensure that the memory configurations are set appropriately. For example:
    • spark.driver.memory and spark.executor.memory should be set to values that your cluster can handle.
    • spark.memory.offHeap.size should be adjusted based on your off-heap memory requirements.
  4. Try to optimize your data processing logic to reduce memory consumption. This might include filtering out unnecessary data early in the processing pipeline or using more memory-efficient operations.
  5. If you have large lookup tables or static data, consider using broadcast variables to distribute this data efficiently across the cluster.
  6. Use Spark's monitoring tools to identify which stages of your job are consuming the most memory. This can help you pinpoint specific areas to optimize.
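A rough sparklyr sketch of points 1, 2, and 4 - the result-size value, partition count, and column name below are illustrative assumptions, not prescriptions:

    library(sparklyr)
    library(dplyr)

    # 1. Raise the result-size cap before connecting (illustrative value).
    conf <- spark_config()
    conf$spark.driver.maxResultSize <- "8g"
    sc <- spark_connect(method = "databricks", config = conf)

    tbl_ref <- sdf_sql(sc, "SELECT * FROM some_catalog.some_schema.some_table")

    result <- tbl_ref %>%
      filter(!is.na(some_column)) %>%         # 4. drop unneeded rows early
      sdf_repartition(partitions = 200) %>%   # 2. spread rows over more partitions
      collect()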

If these suggestions don't resolve the issue, you might need to provide more details about your data and processing logic for a more tailored solution. Let me know if you need further assistance!

leungi
Contributor

@Kaniz_Fatma, thanks for the detailed suggestions.

I believe the first suggestion relates to the issue; however, after adjusting spark.driver.maxResultSize to various values - e.g., 10g, 20g, 30g - a new error ensues (see below).

The operation involves a collect() on a Delta table with 380 MM rows and 5 columns (3.2 GB, partitioned into 55 files). If the average row size is 48 bytes (per the initial error), shouldn't 20 GB be sufficient?
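A quick back-of-the-envelope check of that estimate, counting raw row bytes only and ignoring the temporary decode buffers the error mentions:

    rows          <- 380e6   # 380 MM rows
    bytes_per_row <- 48      # from the initial error message
    rows * bytes_per_row / 1024^3   # ~17 GiB of raw row data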

New Error

The spark driver has stopped unexpectedly and is restarting. Your notebook will be automatically reattached.
at com.databricks.spark.chauffeur.Chauffeur.onDriverStateChange(Chauffeur.scala:1367)

 
