Stream processing a large number of JSON files and handling exceptions
- The application writes many small JSON files, and the expected volume is high (estimate: 1 million files in an hourly window during peak season). Per the current design, these files are ingested with Spark Structured Streaming using Auto Loader.
- Can a Databricks Bronze job handle this volume and load into the Bronze table without failures? (A minimal Auto Loader sketch follows this list.)
- From reading several forums, it appears that streaming one large Parquet file is not ideal either.
- Handling stream-processing failures through a quarantine process:
- When an exception occurs, we would like to write the offending records out for operations review. Are there any best practices or reference materials with examples for handling this? (A quarantine sketch also follows below.)
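A minimal sketch of the ingestion side, assuming PySpark on Databricks; the paths, table name, and tuning values are hypothetical placeholders. At roughly one million files per hour, Auto Loader's file-notification mode (`cloudFiles.useNotifications`) generally scales better than directory listing, and `cloudFiles.maxFilesPerTrigger` bounds each micro-batch so a backlog is drained in manageable chunks:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Hypothetical locations -- replace with your own paths and tables.
source_path = "s3://my-bucket/landing/json/"
checkpoint_path = "s3://my-bucket/checkpoints/bronze/"

bronze_stream = (
    spark.readStream
    .format("cloudFiles")                           # Auto Loader
    .option("cloudFiles.format", "json")
    # File-notification mode avoids listing millions of files per trigger
    # (requires cloud queue/event setup).
    .option("cloudFiles.useNotifications", "true")
    # Persist the inferred schema and add new columns instead of failing.
    .option("cloudFiles.schemaLocation", checkpoint_path + "_schema/")
    .option("cloudFiles.schemaEvolutionMode", "addNewColumns")
    # Bound micro-batch size so a backlog cannot overwhelm the cluster.
    .option("cloudFiles.maxFilesPerTrigger", "50000")
    .load(source_path)
)

(bronze_stream.writeStream
    .format("delta")
    .option("checkpointLocation", checkpoint_path)
    .trigger(availableNow=True)      # or processingTime="1 minute" for a continuous run
    .toTable("bronze.raw_events"))   # hypothetical Bronze table
```

Auto Loader keeps its ingestion state in RocksDB at the checkpoint location, so high file counts are an intended workload; in practice the limiting factors tend to be the notification setup and cluster sizing rather than Auto Loader itself.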
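One common quarantine pattern, sketched here as an assumption rather than an official recipe: use `foreachBatch` to split each micro-batch on a validity check and append failing rows, with error context, to a separate quarantine table for operations review. This replaces the plain `toTable` write above and reuses `bronze_stream` and `checkpoint_path` from that sketch; the `order_id` column, validation rule, and table names are hypothetical. Auto Loader's `_rescued_data` column, which captures fields that did not match the inferred schema, is a convenient validity signal:

```python
from pyspark.sql import functions as F

def write_with_quarantine(batch_df, batch_id):
    """Split a micro-batch into valid and invalid rows; quarantine the invalid ones."""
    # Hypothetical rule: JSON parsed cleanly (nothing rescued) and the
    # mandatory business key is present.
    is_valid = F.col("_rescued_data").isNull() & F.col("order_id").isNotNull()

    # Valid rows go to the Bronze table as usual.
    batch_df.filter(is_valid) \
        .write.format("delta").mode("append").saveAsTable("bronze.raw_events")

    # Invalid rows are annotated with why and when they failed, then appended
    # to a quarantine table that operations can review and replay from.
    (batch_df.filter(~is_valid)
        .withColumn("quarantine_reason",
                    F.when(F.col("_rescued_data").isNotNull(),
                           F.lit("rescued/unparsed fields"))
                     .otherwise(F.lit("missing order_id")))
        .withColumn("quarantined_at", F.current_timestamp())
        .withColumn("source_batch_id", F.lit(batch_id))
        .write.format("delta").mode("append")
        .saveAsTable("bronze.quarantine_events"))   # hypothetical table

(bronze_stream.writeStream
    .foreachBatch(write_with_quarantine)
    .option("checkpointLocation", checkpoint_path)
    .trigger(availableNow=True)
    .start())
```

One caveat: the two writes inside `foreachBatch` are not atomic as a pair, so a retried batch can be partially re-appended; either de-duplicate downstream or use Delta's idempotent-write options (`txnAppId`/`txnVersion`) on each write.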

