MongoDB Streaming Not Receiving Records in Databricks
02-14-2025 01:52 AM
Batch Read (spark.read.format("mongodb")) works fine.
Streaming Read (spark.readStream.format("mongodb")) runs but receives no records.
Batch Read (Works):
df = spark.read.format("mongodb")\
.option("database", database)\
.option("spark.mongodb.read.connection.uri", connectionString)\
.option("collection", collection)\
.schema(schema)\
.load()
Streaming Read (Not Receiving Records):
dfs = spark.readStream.format("mongodb")\
.option("database", database)\
.option("spark.mongodb.read.connection.uri", connectionString)\
.option("collection", collection)\
.schema(schema)\
.load()
Questions:
Does MongoDB require special settings to enable streaming?
Any known issues with MongoDB change streams on Databricks?
02-14-2025 04:55 AM
Hello @vidya_kothavale,
MongoDB requires change streams to enable streaming reads. Change streams let applications access real-time data changes without polling the database. Ensure your MongoDB deployment supports them: change streams are available for replica sets and sharded clusters in MongoDB 3.6 and later (standalone servers do not support them).
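As a sketch of what the streaming read can look like once change streams are available: the connector exposes change-stream options, and publishing the full document (rather than the raw change event) usually matches a schema defined for the collection. The option key below is from the MongoDB Spark Connector v10.x documentation; verify it against your connector version.

```python
# Sketch: streaming read over a MongoDB change stream (Spark Connector v10.x).
# Assumes `spark`, `connectionString`, `database`, `collection`, and `schema`
# are defined as in the snippets above. The change.stream.* option key is the
# v10.x name and may differ in other connector versions.
dfs = (
    spark.readStream.format("mongodb")
    .option("spark.mongodb.read.connection.uri", connectionString)
    .option("database", database)
    .option("collection", collection)
    # Emit the full post-change document instead of the raw change event,
    # so the stream's rows match the collection schema above.
    .option("change.stream.publish.full.document.only", "true")
    .schema(schema)
    .load()
)
```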
02-14-2025 05:24 AM
@Alberto_Umana Currently, only records received after the stream started are available; earlier records are missing. Are any additional steps required?
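This is expected behavior: a change stream only delivers events from the moment it opens, so documents inserted before the stream started never appear. A common pattern (sketched below, not connector-prescribed) is a one-time batch backfill of existing documents followed by the continuous stream; `target_path` and `checkpoint_path` are placeholder names.

```python
# Sketch: backfill pre-existing documents with a batch read, then stream
# new changes. `target_path` and `checkpoint_path` are placeholders for
# your sink location and a streaming checkpoint directory.

# 1) One-time backfill of records that predate the stream.
(spark.read.format("mongodb")
    .option("spark.mongodb.read.connection.uri", connectionString)
    .option("database", database)
    .option("collection", collection)
    .schema(schema)
    .load()
    .write.mode("append")
    .save(target_path))

# 2) Continuous ingestion of new changes from the change stream.
(spark.readStream.format("mongodb")
    .option("spark.mongodb.read.connection.uri", connectionString)
    .option("database", database)
    .option("collection", collection)
    .schema(schema)
    .load()
    .writeStream
    .option("checkpointLocation", checkpoint_path)
    .start(target_path))
```

Note that any documents inserted between the backfill and the stream start can be missed or duplicated, so an idempotent sink (e.g. merge on a key) is safer than blind appends.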

