08-27-2025 08:59 AM
To write a Delta table to MongoDB, you'll need to:
- Read the Delta table into a DataFrame using PySpark (converting it to Pandas only if the table is small enough to fit in driver memory).
- Convert the rows into Python dictionaries, which pymongo stores as MongoDB documents.
- Use a MongoDB client (like pymongo) to insert the data.
Sample code:
from pyspark.sql import SparkSession
from pymongo import MongoClient
# Step 1: Initialize a Spark session with Delta Lake support
spark = (
    SparkSession.builder
    .appName("DeltaToMongo")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)
# Step 2: Read Delta table
delta_df = spark.read.format("delta").load("/path/to/delta/table")
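# Step 3: Convert to a Pandas DataFrame
# Note: toPandas() collects the whole table onto the driver, so this only suits tables that fit in driver memory
pandas_df = delta_df.toPandas()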
# Step 4: Connect to MongoDB
client = MongoClient("mongodb://localhost:27017/")
db = client["your_database"]
collection = db["your_collection"]
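# Step 5: Insert data into MongoDB
# insert_many() raises an error on an empty list, so skip the call when the table has no rows
records = pandas_df.to_dict("records")
if records:
    collection.insert_many(records)
client.close()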
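If the table is too large to convert with toPandas() in one go, one option is to send the rows in batches instead. The sketch below is only an illustration, not part of the sample above: the helper names (clean_document, batched_documents, write_in_batches) and the default batch size of 1000 are assumptions. It uses only the standard library, takes any iterable of row dictionaries, and replaces float NaN values with None so they are stored as null.

```python
import math
from itertools import islice
from typing import Any, Callable, Dict, Iterable, Iterator, List

Document = Dict[str, Any]


def clean_document(row: Dict[str, Any]) -> Document:
    """Copy a row, replacing float NaN with None so MongoDB stores null."""
    return {
        key: None if isinstance(value, float) and math.isnan(value) else value
        for key, value in row.items()
    }


def batched_documents(
    rows: Iterable[Dict[str, Any]], batch_size: int = 1000
) -> Iterator[List[Document]]:
    """Yield lists of at most batch_size cleaned documents."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(rows)
    while True:
        batch = [clean_document(row) for row in islice(iterator, batch_size)]
        if not batch:
            return  # never yield an empty batch
        yield batch


def write_in_batches(
    rows: Iterable[Dict[str, Any]],
    insert_many: Callable[[List[Document]], Any],
    batch_size: int = 1000,
) -> int:
    """Pass each batch to insert_many and return the number of rows sent."""
    total = 0
    for batch in batched_documents(rows, batch_size):
        insert_many(batch)
        total += len(batch)
    return total
```

With the sample above, this could be called as write_in_batches((row.asDict() for row in delta_df.toLocalIterator()), collection.insert_many), which streams rows to the driver instead of collecting them all at once. Nested struct columns would need row.asDict(recursive=True), and this sketch does not handle other special values such as Pandas NaT.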