What are best practices for Spark streaming in Databricks?
06-14-2021 03:15 PM
What are best practices for Spark streaming in Databricks?
- Is it a good idea to consume multiple topics in one streaming job?
- Is autoscaling recommended for Spark streaming?
- How many worker nodes should we choose for a streaming job?
- When should we run OPTIMIZE for continuously streaming topics?
- Are there any other things to consider when implementing streaming jobs with high throughput?
06-18-2021 05:58 AM
What are best practices for Spark streaming in Databricks?
- Is it a good idea to consume multiple topics in one streaming job? - Yes, that is fine. You can create a fair scheduler pool and allocate resources to each stream so that the streams do not interfere with each other.
- Is autoscaling recommended for Spark streaming? - No.
- How many worker nodes should we choose for a streaming job? - Plan for one core per partition.
- When should we run OPTIMIZE for continuously streaming topics? - Any time.
- Are there any other things to consider when implementing streaming jobs with high throughput? - Compute-optimized VMs are preferred as nodes.
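
To make the fair pool and "one core per partition" points concrete, here is a minimal sketch in Python. The helper names `workers_needed` and `start_in_pool` are hypothetical, and the topic names, broker address, and paths in the commented usage are placeholders. It assumes the cluster runs with `spark.scheduler.mode` set to `FAIR`; otherwise the pool assignment may have no effect.

```python
import math


def workers_needed(num_partitions, cores_per_worker):
    """Workers required so that every input partition gets its own core."""
    if num_partitions < 1 or cores_per_worker < 1:
        raise ValueError("partition and core counts must be positive")
    # Round up: a partially used worker still has to be provisioned.
    return math.ceil(num_partitions / cores_per_worker)


def start_in_pool(spark, pool_name, start_query):
    """Start a streaming query with its jobs assigned to a scheduler pool.

    `spark` is an active SparkSession; `start_query` is a zero-argument
    callable that calls `.start()` on a DataStreamWriter and returns the query.
    """
    # The pool is a thread-local property, so it is set right before the
    # query is started from the same thread.
    spark.sparkContext.setLocalProperty("spark.scheduler.pool", pool_name)
    return start_query()


# Hypothetical usage on a cluster (all names and paths are placeholders):
#
# for topic in ["orders", "payments"]:
#     df = (spark.readStream.format("kafka")
#           .option("kafka.bootstrap.servers", "broker:9092")
#           .option("subscribe", topic)
#           .load())
#     start_in_pool(
#         spark,
#         f"pool_{topic}",
#         lambda df=df, topic=topic: (
#             df.writeStream.format("delta")
#             .option("checkpointLocation", f"/tmp/checkpoints/{topic}")
#             .start(f"/tmp/delta/{topic}")
#         ),
#     )
```

For example, a source with 32 partitions on workers with 8 cores each would need 4 workers under this rule, and 33 partitions would need 5. Treat the result as a starting point to validate against observed batch durations, not a guarantee.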
06-18-2021 10:37 AM
See our docs for other considerations when deploying a production streaming job.