Spark Out of Memory Error

leungi
Contributor

Background

I am using the R package {sparklyr} to fetch data from tables in Unity Catalog, and I hit the error below.

I tried the following, to no avail:

  • Using a memory-optimized cluster (e.g., E4d).
  • Using a cluster with more RAM (e.g., E8d).
  • Enabling auto-scaling.
  • Setting the following Spark config:
    • spark.driver.maxResultSize 4096
    • spark.memory.offHeap.enabled true
    • spark.driver.memory 8082
    • spark.executor.instances 4
    • spark.memory.offHeap.size 7284
    • spark.executor.memory 7284
    • spark.executor.cores 4
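
For reference, here is a minimal sketch of how the same settings might be passed through {sparklyr} with explicit unit suffixes. It assumes the bare numbers above were meant as MiB; according to the Spark configuration docs, spark.driver.maxResultSize and spark.memory.offHeap.size are read as bytes when no unit is given, while spark.driver.memory and spark.executor.memory default to MiB. The `method = "databricks"` connection is also an assumption about the environment. Note that spark.driver.memory cannot take effect once the driver JVM is already running, so on a Databricks cluster these values probably belong in the cluster's Spark config rather than in the session.

```r
library(sparklyr)

# Assumption: the bare numbers in the list above were intended as MiB,
# so every size gets an explicit "m" suffix here.
conf <- spark_config()
conf$spark.driver.maxResultSize   <- "4096m"
conf$spark.driver.memory          <- "8082m"
conf$spark.executor.memory        <- "7284m"
conf$spark.executor.instances     <- 4
conf$spark.executor.cores         <- 4
conf$spark.memory.offHeap.enabled <- "true"
conf$spark.memory.offHeap.size    <- "7284m"

# Assumption: running inside a Databricks notebook attached to the cluster.
sc <- spark_connect(method = "databricks", config = conf)
```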

Error

Error: ! org.apache.spark.memory.SparkOutOfMemoryError: Total memory usage during row decode exceeds spark.driver.maxResultSize (4.0 GiB). The average row size was 48.0 B, with 2.9 GiB used for temporary buffers.
Run `sparklyr::spark_last_error()` to see the full Spark error (multiple lines)
To use the previous style of error message set `options("sparklyr.simple.errors" = TRUE)`
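
The two hints in the error message can be followed as sketched below. Both calls are taken directly from the message; this only retrieves more detail about the failure and does not change the memory behaviour.

```r
library(sparklyr)

# Show the full multi-line Spark error from the most recent failed call.
spark_last_error()

# Optionally switch back to the previous style of error message.
options("sparklyr.simple.errors" = TRUE)
```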