How to stop a Streaming Job based on time of the week

nolanlavender00
New Contributor

I have an always-on job cluster triggering Spark Streaming jobs. I would like to stop this streaming job once a week to run table maintenance. I was looking to leverage the foreachBatch function to check a condition and stop the job accordingly.

1 ACCEPTED SOLUTION


mathan_pillai
Valued Contributor

Hi @Nolan Lavender, for example, if you want to stop streaming on Saturday, you could do something like the below. This is just pseudocode.

.foreachBatch { (batchDF: DataFrame, batchId: Long) =>
  // 6 = Saturday with the "u" day-of-week pattern (1 = Monday ... 7 = Sunday)
  if (date_format(current_timestamp(), "u") == 6) {
    // run commands to maintain the table
  }
}
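To make this first approach concrete, here is a minimal runnable sketch. Everything in it is illustrative rather than taken from the thread: the built-in rate source stands in for the real input stream, demo_sink and the checkpoint path are placeholder names, and the Saturday check is done on the driver with java.time instead of evaluating date_format as a Column.

import java.time.{DayOfWeek, LocalDate}
import org.apache.spark.sql.{DataFrame, SparkSession}

val spark = SparkSession.builder.appName("weekly-maintenance-sketch").getOrCreate()

// Placeholder input: the built-in "rate" source stands in for the real stream.
val inputStream = spark.readStream.format("rate").option("rowsPerSecond", "10").load()

// Defining the handler as a typed function value keeps foreachBatch overload resolution unambiguous.
val processBatch: (DataFrame, Long) => Unit = (batchDF, batchId) => {
  batchDF.write.mode("append").saveAsTable("demo_sink") // normal per-batch work (placeholder sink)
  if (LocalDate.now.getDayOfWeek == DayOfWeek.SATURDAY) {
    // run commands to maintain the table here (compaction, statistics refresh, ...)
  }
}

val query = inputStream.writeStream
  .foreachBatch(processBatch)
  .option("checkpointLocation", "/tmp/weekly-maintenance-sketch/checkpoint")
  .start()

query.awaitTermination()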

Alternatively, you can calculate approximately how many micro-batches are processed in a week and stop the streaming job periodically. If your stream processes about 100 micro-batches a week, you can do something like the below.

.foreachBatch { (batchDF: DataFrame, batchId: Long) =>
  if (batchId % 101 == 0) {
    // run commands to maintain the table (roughly once a week at ~100 micro-batches per week)
  }
}
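The pseudocode above runs the maintenance inside foreachBatch while the stream stays up. If the goal is to actually stop the streaming query first, as the original question asks, and run maintenance while it is down, one common pattern (not shown in this thread) is to poll the condition from the driver and call stop() on the StreamingQuery handle returned by start(). A rough sketch, where query and runTableMaintenance() are placeholder names:

import java.time.{DayOfWeek, LocalDate}

// Poll once a minute; leave the loop when the query terminates on its own or Saturday arrives.
while (query.isActive && LocalDate.now.getDayOfWeek != DayOfWeek.SATURDAY) {
  query.awaitTermination(60 * 1000L) // returns after the timeout or when the query terminates
}

if (query.isActive) {
  query.stop() // waits until the streaming query's execution has stopped
}

// With the stream stopped, the weekly table maintenance can run without a concurrent writer.
// runTableMaintenance() // placeholder for the actual maintenance commands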


3 REPLIES

Kaniz
Community Manager

Hi @Nolan Lavender! My name is Kaniz, and I'm a technical moderator here. Great to meet you, and thanks for your question! Let's see if your peers on the forum have an answer first; otherwise, I will follow up shortly with a response.


Kaniz
Community Manager

Hi @Nolan Lavender, how is it going?

Were you able to resolve your problem?
