09-23-2025 05:38 AM
I'm getting the SparkOutOfMemoryError below while reading a 500 MB JSON file. The same pipeline also loads four CSV files (around 150 MB each). When I load the JSON file alone it reads fine, and everything also loads fine on a classic cluster.
Does anyone have an idea how to tweak serverless compute so it can read the JSON file while processing the CSV files?
Job aborted due to stage failure: Task 0 in stage 153.0 failed 4 times, most recent failure: Lost task 0.3 in stage 153.0 (TID 551) (10.46.122.241 executor 0): org.apache.spark.memory.SparkOutOfMemoryError: Photon ran out of memory while executing this query.
Photon failed to reserve 768.0 MiB for simdjson internal usage, in SimdJsonReader, in JsonFileScanNode(id=8883, output_schema=[string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, ... 3 more]), in task.
Memory usage:
Total task memory (including non-Photon): 1152.0 MiB
  task: allocated 262.1 MiB, tracked 1152.0 MiB, untracked allocated 0.0 B, peak 1152.0 MiB
    BufferPool: allocated 6.1 MiB, tracked 128.0 MiB, untracked allocated 0.0 B, peak 128.0 MiB
    DataWriter: allocated 0.0 B, tracked 0.0 B, untracked allocated 0.0 B, peak 0.0 B
    Photon Protobuf Plan Arena: allocated 0.0 B, tracked 0.0 B, untracked allocated 0.0 B, peak 110.8 KiB
    JsonFileScanNode(id=8883, output_schema=[string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, ... 3 more]): allocated 256.0 MiB, tracked 1024.0 MiB, untracked allocated 0.0 B, peak 1024.0 MiB
      JniReader: allocated 1984.0 B, tracked 1984.0 B, untracked allocated 0.0 B, peak 1984.0 B
      SimdJsonReader: allocated 256.0 MiB, tracked 1024.0 MiB, untracked allocated 0.0 B, peak 1024.0 MiB
        JSON buffer: allocated 256.0 MiB, tracked 256.0 MiB, untracked allocated 0.0 B, peak 256.0 MiB
        simdjson internal usage: allocated 0.0 B, tracked 768.0 MiB, untracked allocated 0.0 B, peak 768.0 MiB
    ProjectNode(id=8893, output_schema=[string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, ... 3 more]): allocated 0.0 B, tracked 0.0 B, untracked allocated 0.0 B, peak 0.0 B
    ProjectNode(id=8908, output_schema=[string, struct<string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, ... 4 more>]): allocated 0.0 B, tracked 0.0 B, untracked allocated 0.0 B, peak 0.0 B
    SortNode(id=8911, output_schema=[string, struct<string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, ... 4 more>]): allocated 0.0 B, tracked 0.0 B, untracked allocated 0.0 B, peak 0.0 B
      Sorter: allocated 0.0 B, tracked 0.0 B, untracked allocated 0.0 B, peak 0.0 B
        spilled run buffers: allocated 0.0 B, tracked 0.0 B, untracked allocated 0.0 B, peak 0.0 B
        output batch var len data: allocated 0.0 B, tracked 0.0 B, untracked allocated 0.0 B, peak 0.0 B
Memory consumers:
Acquired by com.databricks.photon.NativeMemoryConsumer@9cc6126: 1152.0 MiB
at 0xbca6493 <photon>.CreateReservationError(external/workspace_spark_3_5/photon/common/memory-tracker.cc:561)
at 0xbca51c7 <photon>.GrowBuffer(external/workspace_spark_3_5/photon/io/json/simd-json-reader.cc:295)
at 0x77b6d5f <photon>.TryLoadDocumentsFromStream(external/workspace_spark_3_5/photon/io/json/simd-json-reader.cc:313)
at 0x77b70e3 <photon>.HasNext(external/workspace_spark_3_5/photon/io/json/simd-json-reader.cc:365)
at 0x6e5444b <photon>.ReaderHasNext(external/workspace_spark_3_5/photon/exec-nodes/common-file-scan-node.h:139)
at 0x6e5405b <photon>.HasNextImpl(external/workspace_spark_3_5/photon/exec-nodes/json-file-scan-node.cc:121)
at 0x6d7c5e7 <photon>.OpenImpl(external/workspace_spark_3_5/photon/exec-nodes/sort-node.cc:140)
at com.databricks.photon.JniApiImpl.open(Native Method)
at com.databricks.photon.JniApi.open(JniApi.scala)
at com.databricks.photon.JniExecNode.open(JniExecNode.java:73)
at com.databricks.photon.PhotonColumnarBatchResultHandler.$anonfun$getResult$4(PhotonExec.scala:1224)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at com.databricks.photon.PhotonResultHandler.timeit(PhotonResultHandler.scala:30)
at com.databricks.photon.PhotonResultHandler.timeit$(PhotonResultHandler.scala:28)
at com.databricks.photon.PhotonColumnarBatchResultHandler.timeit(PhotonExec.scala:1216)
at com.databricks.photon.PhotonColumnarBatchResultHandler.getResult(PhotonExec.scala:1224)
at com.databricks.photon.PhotonBasicEvaluatorFactory$PhotonBasicEvaluator$$anon$1.open(PhotonBasicEvaluatorFactory.scala:252)
at com.databricks.photon.PhotonBasicEvaluatorFactory$PhotonBasicEvaluator$$anon$1.hasNextImpl(PhotonBasicEvaluatorFactory.scala:257)
at com.databricks.photon.PhotonBasicEvaluatorFactory$PhotonBasicEvaluator$$anon$1.$anonfun$hasNext$1(PhotonBasicEvaluatorFactory.scala:275)
at scala.runtime.java8.JFunction0$mcZ$sp.apply(JFunction0$mcZ$sp.java:23)
at com.databricks.photon.metrics.BillableTimeTaskMetrics.withPhotonBilling(BillableTimeTaskMetrics.scala:71)
at org.apache.spark.TaskContext.runFuncAsBillable(TaskContext.scala:267)
at com.databricks.photon.PhotonBasicEvaluatorFactory$PhotonBasicEvaluator$$anon$1.hasNext(PhotonBasicEvaluatorFactory.scala:275)
at com.databricks.photon.CloseableIterator$$anon$10.hasNext(CloseableIterator.scala:211)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.columnartorow_nextBatch_0$(Unknown Source)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at org.apache.spark.sql.execution.WholeStageCodegenEvaluatorFactory$WholeStageCodegenPartitionEvaluator$$anon$1.hasNext(WholeStageCodegenEvaluatorFactory.scala:50)
at org.apache.spark.sql.execution.aggregate.SortAggregateExec.$anonfun$doExecute$1(SortAggregateExec.scala:67)
at org.apache.spark.sql.execution.aggregate.SortAggregateExec.$anonfun$doExecute$1$adapted(SortAggregateExec.scala:64)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsWithIndexInternal$2(RDD.scala:932)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsWithIndexInternal$2$adapted(RDD.scala:932)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60)
at org.apache.spark.rdd.RDD.$anonfun$computeOrReadCheckpoint$1(RDD.scala:420)
at com.databricks.spark.util.ExecutorFrameProfiler$.record(ExecutorFrameProfiler.scala:110)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:417)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:384)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60)
at org.apache.spark.rdd.RDD.$anonfun$computeOrReadCheckpoint$1(RDD.scala:420)
at com.databricks.spark.util.ExecutorFrameProfiler$.record(ExecutorFrameProfiler.scala:110)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:417)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:384)
at org.apache.spark.scheduler.ShuffleMapTask.$anonfun$runTask$3(ShuffleMapTask.scala:83)
at com.databricks.spark.util.ExecutorFrameProfiler$.record(ExecutorFrameProfiler.scala:110)
at org.apache.spark.scheduler.ShuffleMapTask.$anonfun$runTask$1(ShuffleMapTask.scala:82)
at com.databricks.spark.util.ExecutorFrameProfiler$.record(ExecutorFrameProfiler.scala:110)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:58)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:39)
at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:227)
at org.apache.spark.scheduler.Task.doRunTask(Task.scala:204)
at org.apache.spark.scheduler.Task.$anonfun$run$5(Task.scala:166)
at com.databricks.unity.UCSEphemeralState$Handle.runWith(UCSEphemeralState.scala:51)
at com.databricks.unity.HandleImpl.runWith(UCSHandle.scala:104)
at com.databricks.unity.HandleImpl.$anonfun$runWithAndClose$1(UCSHandle.scala:109)
at scala.util.Using$.resource(Using.scala:269)
at com.databricks.unity.HandleImpl.runWithAndClose(UCSHandle.scala:108)
at org.apache.spark.scheduler.Task.$anonfun$run$1(Task.scala:160)
at com.databricks.spark.util.ExecutorFrameProfiler$.record(ExecutorFrameProfiler.scala:110)
at org.apache.spark.scheduler.Task.run(Task.scala:105)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$11(Executor.scala:1227)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:80)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:77)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:112)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:1231)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at com.databricks.spark.util.ExecutorFrameProfiler$.record(ExecutorFrameProfiler.scala:110)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:1083)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.java:840)
Driver stacktrace:
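For what it's worth, the numbers in the memory report line up exactly with the task cap, which seems to be why the 768 MiB reservation fails: the tracked BufferPool (128 MiB) plus the JSON buffer (256 MiB) plus the attempted simdjson reservation (768 MiB) together equal the 1152 MiB total task budget. A quick sanity check, using only the figures from the trace above:

```python
MIB = 1024 * 1024  # bytes per MiB

# Figures taken from the memory report in the error above
total_task_budget = 1152 * MIB   # "Total task memory (including non-Photon)"
buffer_pool       = 128 * MIB    # BufferPool, tracked
json_buffer       = 256 * MIB    # SimdJsonReader "JSON buffer", tracked
simdjson_internal = 768 * MIB    # the reservation that failed

# If the reservation had been granted, tracked memory would hit the cap exactly
tracked_if_granted = buffer_pool + json_buffer + simdjson_internal
print(tracked_if_granted == total_task_budget)  # True: the reservation exhausts the budget
```

So the JSON scan alone needs the entire per-task budget on serverless, leaving nothing for the concurrent CSV work, which would explain why the file reads fine in isolation or on a classic cluster with more executor memory.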