mongodb spark

seefoods · ‎09-18-2025

Hello Guys,

Does anyone know a technique for writing a Delta table to MongoDB using the MongoDB Spark connector org.mongodb.spark:mongo-spark-connector_2.12:10.5.0?

I have 1 billion records to write.

Thanks

szymon_dybczak · ‎09-18-2025

Hi @seefoods ,

Yes, this is well described on the MongoDB Spark Connector documentation page. To write data to MongoDB, use the write property of your DataFrame object. It returns a DataFrameWriter object, which you can use to specify the format and other configuration settings for your batch write operation.

Here's an example of how to use it:

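# Assumes an existing SparkSession named `spark` with the MongoDB connection URI configured.
dataFrame = spark.createDataFrame(
    [("Bilbo Baggins", 50), ("Gandalf", 1000), ("Thorin", 195), ("Balin", 178), ("Kili", 77),
     ("Dwalin", 169), ("Oin", 167), ("Gloin", 158), ("Fili", 82), ("Bombur", None)],
    ["name", "age"],
)

# The outer parentheses let the chained calls span several lines in Python.
(dataFrame.write.format("mongodb")
    .mode("append")
    .option("database", "people")
    .option("collection", "contacts")
    .save())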

One thing to note here: the MongoDB Spark Connector supports the following save modes:

append
overwrite

So, in your case, just read the Delta table into a DataFrame and use the DataFrameWriter object as described above.

Write to MongoDB in Batch Mode - Spark Connector - MongoDB Docs
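
For a table of this size, the batch approach above could be sketched roughly as follows. This is only a minimal sketch, not a tuned setup: the helper names (mongo_write_options, write_delta_to_mongo), the connection URI, and the table name are made up for illustration, and the maxBatchSize, ordered, and partition values are placeholders you would need to adjust for your cluster. It assumes `spark` is an existing SparkSession with the connector installed.

```python
def mongo_write_options(uri, database, collection, max_batch_size=1000, ordered=False):
    """Build the option map passed to DataFrameWriter.options() (hypothetical helper)."""
    return {
        "connection.uri": uri,
        "database": database,
        "collection": collection,
        # Number of operations sent per bulk request (the connector default is 512).
        "maxBatchSize": str(max_batch_size),
        # Unordered bulk writes keep going after an individual document fails.
        "ordered": str(ordered).lower(),
    }


def write_delta_to_mongo(spark, table_name, options, num_partitions=None):
    """Read a Delta table and append its rows to a MongoDB collection."""
    df = spark.read.table(table_name)
    if num_partitions is not None:
        # Each partition is written by its own task, so this bounds write parallelism.
        df = df.repartition(num_partitions)
    df.write.format("mongodb").mode("append").options(**options).save()


# Hypothetical usage in a notebook where `spark` already exists:
# opts = mongo_write_options("mongodb://localhost:27017", "people", "contacts")
# write_delta_to_mongo(spark, "my_catalog.my_schema.people", opts, num_partitions=64)
```

Keeping the options in a plain dict makes it easy to try different batch sizes without touching the write logic.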

Edit: This connector also supports streaming mode, so that is something you can consider if you want an easy way to load data incrementally from a Delta table into MongoDB.

Streaming Mode - Spark Connector - MongoDB Docs
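
The incremental variant might look something like the sketch below. Again, the helper names, URI, table name, and checkpoint path are hypothetical, and `spark` is assumed to be an existing SparkSession. The availableNow trigger (Spark 3.3+) processes whatever is currently in the table and then stops, so the job can be scheduled and will pick up only new rows on each run thanks to the checkpoint.

```python
def mongo_stream_options(uri, database, collection, checkpoint_path):
    """Build the option map for the streaming sink (hypothetical helper)."""
    return {
        "spark.mongodb.connection.uri": uri,
        "spark.mongodb.database": database,
        "spark.mongodb.collection": collection,
        # Structured Streaming stores its progress here between runs.
        "checkpointLocation": checkpoint_path,
    }


def stream_delta_to_mongo(spark, table_name, options):
    """Start a streaming query that appends new Delta rows to MongoDB."""
    source = spark.readStream.table(table_name)  # Delta table as a streaming source
    return (
        source.writeStream.format("mongodb")
        .options(**options)
        .outputMode("append")
        # Process all data available now, then stop (requires Spark 3.3+).
        .trigger(availableNow=True)
        .start()
    )


# Hypothetical usage:
# opts = mongo_stream_options("mongodb://localhost:27017", "people", "contacts",
#                             "/tmp/checkpoints/people_to_mongo")
# query = stream_delta_to_mongo(spark, "my_catalog.my_schema.people", opts)
# query.awaitTermination()
```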
