06-23-2021 01:07 PM
Consider a basic Structured Streaming use case: aggregate the data, perform some basic data-cleaning transformations, and merge the result into a historical aggregate dataset.
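A minimal sketch of that pipeline, assuming Auto Loader as the source and a Delta target; the paths, table name, and column names (`key`, `amount`) are illustrative, and the code is meant to run on a Databricks cluster, not locally:

```python
# Sketch (hypothetical names): aggregate a stream with basic cleaning,
# then MERGE each micro-batch into a historical Delta aggregate table.
def build_stream(spark, source_path, target_table, checkpoint_path):
    # Imports resolved only when this runs on a cluster with pyspark/delta.
    from pyspark.sql import functions as F
    from delta.tables import DeltaTable

    def merge_batch(batch_df, batch_id):
        # Upsert the micro-batch aggregates into the historical table.
        target = DeltaTable.forName(spark, target_table)
        (target.alias("t")
               .merge(batch_df.alias("s"), "t.key = s.key")
               .whenMatchedUpdateAll()
               .whenNotMatchedInsertAll()
               .execute())

    agg = (spark.readStream.format("cloudFiles")           # Auto Loader
                 .option("cloudFiles.format", "json")
                 .load(source_path)
                 .withColumn("key", F.trim(F.col("key")))  # basic cleaning
                 .groupBy("key")
                 .agg(F.sum("amount").alias("total")))

    return (agg.writeStream
               .foreachBatch(merge_batch)
               .outputMode("update")
               .option("checkpointLocation", checkpoint_path)
               .trigger(processingTime="5 minutes")        # interval trigger
               .start())
```

`foreachBatch` is used here because a streaming query cannot express a MERGE directly; each micro-batch is upserted as an ordinary batch write.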
06-23-2021 06:20 PM
What I can think of:
1. Set a processing-time trigger interval rather than running continuously. Frequent micro-batches mean frequent API calls to checkpoint storage, which increases cloud-vendor costs (not DBUs).
2. If you have multiple streams, multiplex them onto one cluster rather than running a separate cluster for each stream.
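A back-of-envelope illustration of point 1, under the simplifying assumption that each micro-batch writes roughly two files (offsets plus commit) to checkpoint storage:

```python
SECONDS_PER_DAY = 86_400

def checkpoint_writes_per_day(trigger_seconds, files_per_batch=2):
    # Assumption: ~2 checkpoint files (offsets + commit) per micro-batch.
    return (SECONDS_PER_DAY // trigger_seconds) * files_per_batch

fast = checkpoint_writes_per_day(1)    # near-continuous: 172,800 writes/day
slow = checkpoint_writes_per_day(300)  # 5-minute trigger: 576 writes/day
print(fast // slow)                    # -> 300x fewer storage API calls
```

The exact file counts vary by source and Spark version, but the ratio is what matters: the storage-API bill scales inversely with the trigger interval.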
12-10-2021 06:41 AM
This will help a lot; please make sure you follow these recommendations before moving to production:
https://docs.databricks.com/spark/latest/structured-streaming/production.html
12-21-2022 07:10 PM
I second the recommendations: Auto Loader with a trigger, and batch processing instead of continuous streaming where the use case permits.
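One way to get batch-style processing out of a streaming pipeline is an `availableNow` trigger (Databricks Runtime 10.2+): the query processes the current backlog incrementally and then stops, so it can run on a job schedule instead of a long-lived cluster. A sketch, with hypothetical paths:

```python
# Sketch (hypothetical paths): run a streaming pipeline as an
# incremental batch job that stops once the backlog is consumed.
def run_incremental(spark, source_path, out_path, checkpoint_path):
    stream = (spark.readStream.format("cloudFiles")        # Auto Loader
                    .option("cloudFiles.format", "parquet")
                    .load(source_path))
    return (stream.writeStream
                  .format("delta")
                  .option("checkpointLocation", checkpoint_path)
                  .trigger(availableNow=True)  # drain backlog, then stop
                  .start(out_path))
```

Because the checkpoint tracks progress, each scheduled run picks up exactly where the previous one left off.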
12-26-2022 04:28 AM
Yes, correct.