08-20-2021 01:51 PM
I have an always-on job cluster triggering Spark Streaming jobs. I would like to stop this streaming job once a week to run table maintenance. I was looking to leverage the foreachBatch function to check a condition and stop the job accordingly.
- Labels:
  - Foreachbatch
  - Job clusters
  - Streaming spark
Accepted Solutions
09-20-2021 09:44 AM
Hi @Nolan Lavender, for example, if you want to run the maintenance on Saturday, you could do something like the below. Note that `foreachBatch` runs on the driver, so a plain JVM date check works (pseudocode with `date_format(current_timestamp(), "u")` would not compile as written, since that returns a Spark `Column` rather than a number):

.foreachBatch { (batchDF: DataFrame, batchId: Long) =>
  // DayOfWeek.SATURDAY corresponds to ISO day-of-week 6
  if (java.time.LocalDate.now.getDayOfWeek == java.time.DayOfWeek.SATURDAY) {
    // run commands to maintain the table
  }
}

Alternatively, you can estimate how many micro-batches are processed in a week and run the maintenance periodically. If your stream processes roughly 100 micro-batches in a week, you can do something like the below:

.foreachBatch { (batchDF: DataFrame, batchId: Long) =>
  // batchId increases by one per micro-batch, so this fires roughly once a week
  if (batchId % 100 == 0) {
    // run commands to maintain the table
  }
}
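Since the original question is about stopping the job itself before maintenance, here is a minimal driver-side sketch of that pattern. Everything specific below is an assumption for illustration: the `rate` source, the output path `/tmp/events`, and the one-minute polling interval.

```scala
import java.time.{DayOfWeek, LocalDate}
import org.apache.spark.sql.{DataFrame, SparkSession}

object WeeklyMaintenance {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("weekly-maintenance").getOrCreate()

    // Illustrative stream; any source/sink behaves the same way here
    val query = spark.readStream
      .format("rate")
      .load()
      .writeStream
      .foreachBatch { (batchDF: DataFrame, batchId: Long) =>
        batchDF.write.format("delta").mode("append").save("/tmp/events")
      }
      .option("checkpointLocation", "/tmp/events/_checkpoint")
      .start()

    // Poll on the driver and stop the query once the maintenance day arrives.
    // query.stop() ends the stream cleanly, so it can resume from the
    // checkpoint after maintenance.
    while (query.isActive) {
      if (LocalDate.now.getDayOfWeek == DayOfWeek.SATURDAY) {
        query.stop()
        // run OPTIMIZE / VACUUM here, then exit (or restart the stream)
      } else {
        Thread.sleep(60 * 1000)
      }
    }
  }
}
```

On Databricks you could equally schedule the streaming job and the maintenance job separately; the sketch above just keeps both in one driver process.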
06-27-2024 08:27 PM
You could also use the "Available-now micro-batch" trigger. It processes all the data available when the query starts (in one or more micro-batches) and then stops on its own, so you can do whatever you want between runs (sleep, shut down, vacuum, etc.)
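A minimal sketch of that trigger (`Trigger.AvailableNow` was added in Spark 3.3; the Delta source/sink paths below are placeholders):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.streaming.Trigger

val spark = SparkSession.builder.appName("available-now").getOrCreate()

// AvailableNow drains everything available at start, possibly as several
// micro-batches, and then stops the query on its own.
val query = spark.readStream
  .format("delta")
  .load("/tmp/source")                       // placeholder source path
  .writeStream
  .format("delta")
  .option("checkpointLocation", "/tmp/sink/_checkpoint")
  .trigger(Trigger.AvailableNow())
  .start("/tmp/sink")                        // placeholder sink path

query.awaitTermination()   // returns once the backlog is drained
// table maintenance (OPTIMIZE, VACUUM, ...) can run here before the next run
```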

