MongoDB to Databricks driver killed and compute re-attached

DoredlaCharan · 02-09-2026

I am reading data from MongoDB with spark.read, which uses the mongo-spark-connector. By default the connector's sampleSize is 1000, meaning it looks at only 1000 documents in the collection to decide which keys become columns in the DataFrame. To pick up all keys, I increased sampleSize towards the number of documents in the collection. In my case each document has 100+ keys.
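To illustrate why the sample size matters, here is a minimal sketch in plain Python. It is not the connector's actual code. The `infer_columns` helper, the made-up documents, and the `discount_code` key are all hypothetical; the sketch only assumes that columns are derived from the keys seen in a random sample of documents.

```python
import random


def infer_columns(documents, sample_size, seed=0):
    """Return the union of top-level keys seen in a random sample of documents.

    Hypothetical stand-in for sample-based schema inference.
    """
    rng = random.Random(seed)
    k = min(sample_size, len(documents))
    sampled = rng.sample(documents, k)
    columns = set()
    for doc in sampled:
        columns.update(doc.keys())
    return columns


# Made-up collection: every document has "_id" and "name",
# but only 1 in 500 documents carries the rare key "discount_code".
documents = [
    {"_id": i, "name": f"item-{i}", **({"discount_code": "X"} if i % 500 == 0 else {})}
    for i in range(10_000)
]

# Sampling every document is guaranteed to see every key.
full = infer_columns(documents, sample_size=len(documents))

# A small sample always sees the common keys but can easily miss the rare one.
small = infer_columns(documents, sample_size=10)

print(sorted(full))
print(sorted(small))
```

A larger sample makes it more likely that rare keys are seen, but the sampled documents and their keys have to be processed before the DataFrame is even defined, so the cost of inference grows with the sample size.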

Compute used: Legacy compute

Code:

df = spark.read \
    .format("mongodb") \
    .option("spark.mongodb.connection.uri", mongo_url) \
    .option("database", database) \
    .option("collection", collection) \
    .option("mergeSchema", "true") \
    .option("partitioner", "MongoShardedPartitioner") \
    .option("partitionerOptions.shardKey", "_id") \
    .option("sampleSize", "100000") \
    .load()

Error:

"The spark driver has stopped unexpectedly and is restarting. Your notebook will be automatically reattached.
	at com.databricks.spark.chauffeur.Chauffeur.onDriverStateChange(Chauffeur.scala:2035)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)"
