When running structured streaming jobs in production, what are the general best practices to reduce cost?
Consider a basic structured streaming use case: aggregating the data, performing some basic data-cleaning transformations, and merging the result into a historical aggregate dataset.
Latest Reply
I second the recommendations: Auto Loader with a trigger, and batch processing instead of continuous streaming where the use case permits. In addition:
- test with a small batch first
- favor fewer, larger workers over more, smaller workers
- adjust your job cluster over...
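
For what it's worth, here is a rough sketch of what "Auto Loader with a trigger" could look like for the aggregate-clean-merge case in the question. It is only an illustration under assumptions: the table name `history.daily_counts`, the columns `key`, `event_id`, and `events`, the JSON source format, and the paths are all hypothetical, and `spark` is assumed to be an existing session on a Databricks cluster (the `cloudFiles` source is not available in open-source Spark). The `availableNow` trigger needs Spark 3.3 or later.

```python
# Sketch only: table name, paths, and column names are hypothetical.

def merge_sql(target: str, updates_view: str = "updates") -> str:
    """Build a MERGE that adds each batch's counts to the historical totals."""
    return (
        f"MERGE INTO {target} AS t "
        f"USING {updates_view} AS s ON t.key = s.key "
        "WHEN MATCHED THEN UPDATE SET t.events = t.events + s.events "
        "WHEN NOT MATCHED THEN INSERT (key, events) VALUES (s.key, s.events)"
    )


def upsert_batch(batch_df, batch_id, target="history.daily_counts"):
    """foreachBatch handler: clean, aggregate, then merge one micro-batch."""
    # Basic cleaning: drop rows without a key and duplicate events in the batch.
    cleaned = batch_df.dropna(subset=["key"]).dropDuplicates(["event_id"])
    # Aggregate the batch down to one row per key before touching the target.
    agg = cleaned.groupBy("key").count().withColumnRenamed("count", "events")
    agg.createOrReplaceTempView("updates")
    # Use the batch's own session so the temp view is visible to the MERGE.
    batch_df.sparkSession.sql(merge_sql(target))


def run_once(spark, source_path, checkpoint_path):
    """Process everything that has arrived since the last run, then stop."""
    stream = (
        spark.readStream.format("cloudFiles")  # Auto Loader (Databricks only)
        .option("cloudFiles.format", "json")
        .option("cloudFiles.schemaLocation", checkpoint_path)
        .load(source_path)
    )
    query = (
        stream.writeStream.foreachBatch(upsert_batch)
        .option("checkpointLocation", checkpoint_path)
        .trigger(availableNow=True)  # drain the backlog, then shut down
        .start()
    )
    query.awaitTermination()
```

Scheduling `run_once` as a job on a job cluster means the cluster only runs while there is a backlog to process, instead of staying up around the clock. One caveat with this sketch: the merge adds counts, so a retried batch would be counted twice; a production version would likely need an idempotent merge key.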