06-23-2021 01:07 PM
Consider a basic Structured Streaming use case: aggregate the data, perform some basic data-cleaning transformations, and merge the result into a historical aggregate dataset.
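In Spark Structured Streaming this pattern is typically handled with `foreachBatch`, merging each micro-batch's partial aggregates into the historical table (on Databricks, usually a Delta `MERGE`). The upsert logic itself can be sketched in plain Python; the historical table is modeled as a dict keyed by the aggregation key, and all names here are illustrative:

```python
# Minimal sketch of the merge step: each micro-batch produces partial
# aggregates (key -> (count, sum)), which are upserted into a historical
# aggregate store. In a real pipeline this would be a Delta MERGE inside
# foreachBatch; here the store is a plain dict for illustration.

def merge_batch(historical, batch_aggregates):
    """Upsert one micro-batch's aggregates into the historical aggregates."""
    for key, (count, total) in batch_aggregates.items():
        if key in historical:
            old_count, old_total = historical[key]
            historical[key] = (old_count + count, old_total + total)
        else:
            historical[key] = (count, total)
    return historical

def clean_and_aggregate(rows):
    """Basic cleaning (drop records with null keys) plus aggregation."""
    aggregates = {}
    for key, value in rows:
        if key is None:          # data-cleaning step: discard bad records
            continue
        count, total = aggregates.get(key, (0, 0))
        aggregates[key] = (count + 1, total + value)
    return aggregates

# Example: two micro-batches arriving over time
historical = {}
batch1 = [("a", 10), ("b", 5), (None, 99)]
batch2 = [("a", 2)]
for batch in (batch1, batch2):
    merge_batch(historical, clean_and_aggregate(batch))
# historical == {"a": (2, 12), "b": (1, 5)}
```

The important property is that the merge is idempotent per key update and only touches keys present in the incoming micro-batch, which is what keeps the incremental pattern cheap compared with recomputing the full history.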
06-23-2021 06:20 PM
What I can think of is:
1. Set a processing-time trigger with some interval rather than running continuously. The API hits on checkpoint storage increase cost; not DBUs, but the cloud vendor's storage-request charges.
2. If you have multiple streams, multiplex them onto one cluster rather than running a different cluster for each stream.
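To put a rough number on tip 1: every trigger commits checkpoint state to cloud storage, so the trigger interval directly drives the number of storage API requests the vendor bills for. A back-of-the-envelope comparison, assuming (illustratively) one checkpoint commit per micro-batch; real pipelines write several files per commit, so treat these as relative numbers:

```python
# Rough comparison of daily checkpoint commits at different trigger
# intervals. Assumption (illustrative): one commit per micro-batch; an
# always-on stream fires a micro-batch roughly every second while data
# keeps arriving.

SECONDS_PER_DAY = 24 * 60 * 60

def commits_per_day(trigger_interval_seconds):
    """Number of micro-batch commits per day at a given trigger interval."""
    return SECONDS_PER_DAY // trigger_interval_seconds

always_on = commits_per_day(1)        # back-to-back micro-batches
every_5_min = commits_per_day(300)    # e.g. trigger(processingTime="5 minutes")

print(always_on, every_5_min)         # 86400 vs 288: ~300x fewer requests
```

Since cloud object stores bill per request, moving from second-level micro-batches to a 5-minute processing-time trigger cuts checkpoint-related request charges by roughly two orders of magnitude, with no change in DBU consumption model.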
12-10-2021 06:41 AM
This will help a lot. Please ensure we follow these guidelines before moving to production:
https://docs.databricks.com/spark/latest/structured-streaming/production.html
12-21-2022 07:10 PM
I second the recommendations: Auto Loader with a trigger, and batch processing instead of continuous streaming where the use case permits.
12-26-2022 04:28 AM
Yes, correct.