05-29-2024 08:51 AM
Hi @Mathias,
I'd say that watermarking might be a good solution for your use case. Please check the article "Control late data threshold with multiple watermark policy in Structured Streaming".
If you want to dig in further, there's also the "Handling Late Data and Watermarking" section of the Spark Structured Streaming Programming Guide.
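To give a rough idea of what the watermark threshold and the multiple watermark policy mean, here is a small plain-Python sketch. It is only a simplified model of the semantics, not Spark code: the function names, timestamps, and delays are made up for illustration, and real Spark advances the watermark per micro-batch and (for windowed aggregations) compares it against the window end. In an actual job you would set the threshold with `DataFrame.withWatermark(eventTime, delayThreshold)` and choose the policy through the `spark.sql.streaming.multipleWatermarkPolicy` setting (`min` by default, or `max`).

```python
from datetime import datetime, timedelta


def stream_watermark(latest_event_time, delay):
    """Simplified per-stream watermark: latest event time seen minus the allowed delay."""
    return latest_event_time - delay


def global_watermark(watermarks, policy="min"):
    """Combine per-stream watermarks the way the 'min' and 'max' policies are described."""
    if policy == "min":
        return min(watermarks)  # wait for the slowest stream, drops less data
    if policy == "max":
        return max(watermarks)  # follow the fastest stream, drops more data
    raise ValueError(f"unknown policy: {policy}")


def is_too_late(event_time, watermark):
    """In this simplified model, an event older than the watermark is dropped."""
    return event_time < watermark


# Hypothetical example: two input streams with different delay thresholds.
start = datetime(2024, 5, 29, 8, 0)
wm_a = stream_watermark(start + timedelta(minutes=50), timedelta(minutes=10))  # 08:40
wm_b = stream_watermark(start + timedelta(minutes=30), timedelta(minutes=20))  # 08:10

wm_min = global_watermark([wm_a, wm_b], policy="min")
wm_max = global_watermark([wm_a, wm_b], policy="max")

# A record with event time 08:25 arrives late.
late_event = start + timedelta(minutes=25)
dropped_under_min = is_too_late(late_event, wm_min)
dropped_under_max = is_too_late(late_event, wm_max)
```

In this made-up scenario the late record is still accepted under the `min` policy but would be dropped under `max`, which is the trade-off between completeness and latency that you would be choosing.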
There are other ways to achieve what you're aiming for; I think it's more of a design decision.
Best regards,
Raphael Balogo
Sr. Technical Solutions Engineer
Databricks