cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

Stream-stream join using MongoDB sink

sowj02
New Contributor

I am performing stream-to-stream join in Databricks using MongoDB as a source (readStream()). Both sources collections receive data at same time. Initially I tried with using watermarks 

orderWithWatermark = order \
  .selectExpr("order_id AS orderId","event_created AS orderWatermark","approved_date")\
  .withWatermark("orderWatermark", "2 minutes")
orderstatusWithWatermark = orderstatus \
  .selectExpr("order_id AS orderstatusId","event_created AS orderstatusWatermark","order_date","amount")\
  .withWatermark("orderstatusWatermark", "2 minutes")
when i joined them
joined_df=orderWithWatermark.join(
  orderstatusWithWatermark,
  expr("""
   orderId = orderstatusId AND
   icb.icbWatermark >= ica.icaWatermark
    """))
and writing the joined_df using writestream to mongodb sink
I am facing an issue
[STREAM_FAILED] Query [id = 37bd39d7-dde1-4d19-8e9f-f4718c27dca4, runId = 735152c6-0398-46aa-b673-a46ac8fa1848] terminated with exception: true SQLSTATE: XXKST
I need help is there a recommended way to handle issue ?
1 REPLY 1

cgrant
Databricks Employee
Databricks Employee

There is not enough information in this high-level error message. Please expand the full stacktrace and feel free to post it here

Join Us as a Local Community Builder!

Passionate about hosting events and connecting people? Help us grow a vibrant local community—sign up today to get started!

Sign Up Now