Stream-stream join using MongoDB sink
Options
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
03-02-2025 10:37 PM
I am performing stream-to-stream join in Databricks using MongoDB as a source (readStream()). Both sources collections receive data at same time. Initially I tried with using watermarks
orderWithWatermark = order \
.selectExpr("order_id AS orderId","event_created AS orderWatermark","approved_date")\
.withWatermark("orderWatermark", "2 minutes")
orderstatusWithWatermark = orderstatus \
.selectExpr("order_id AS orderstatusId","event_created AS orderstatusWatermark","order_date","amount")\
.withWatermark("orderstatusWatermark", "2 minutes")
when i joined them
joined_df=orderWithWatermark.join(
orderstatusWithWatermark,
expr("""
orderId = orderstatusId AND
icb.icbWatermark >= ica.icaWatermark
"""))
and writing the joined_df using writestream to mongodb sink
I am facing an issue
[STREAM_FAILED] Query [id = 37bd39d7-dde1-4d19-8e9f-f4718c27dca4, runId = 735152c6-0398-46aa-b673-a46ac8fa1848] terminated with exception: true SQLSTATE: XXKST
I need help is there a recommended way to handle issue ?
1 REPLY 1
Options
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
03-07-2025 09:00 AM
There is not enough information in this high-level error message. Please expand the full stacktrace and feel free to post it here

