Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Workflow scheduling on particular working day of the month in ADB

maurya_vish24
New Contributor

Hi,

I am looking to schedule a workflow to execute on the 3rd working day of each month, where a working day is Mon-Fri. I could not find a direct crontab solution, so I built a watcher-file workaround instead. The code below creates a watcher file on the 3rd working day of the month; it is added as a task in the workflow, and the subsequent tasks depend on it. The problem with this solution is that I have to schedule the workflow at least 4-5 times every month, because the Nth working day can land on any of 4-5 different calendar dates (for example, the 8th working day falls somewhere between the 9th and the 14th). Only one of those scheduled runs succeeds; the rest show up as failed jobs in the workflow history, which is never a good design. Can somebody help me here? It would be a great help.

 

# Code to check whether today's calendar date is the 3rd working day of the month

from datetime import datetime, timedelta
 
today = datetime.today()
year = today.year
month = today.month
 
# Count business days from start of month until today
business_day_count = 0
current = datetime(year, month, 1)
 
while current <= today:
    if current.weekday() < 5:  # 0=Mon ... 4=Fri
        business_day_count += 1
    current += timedelta(days=1)
 
print(f"Today is business day #{business_day_count}")
 
if business_day_count == 3:
    print("Today is the rd working day โ†’ Continue workflow")
    date_stamp = datetime.now().strftime("%Y%m%d")
    # Create file path with timestamp
    file_path = f"/mnt/refined/workday_check/pipeline_log_{date_stamp}.txt"
    # Create empty file using dbutils.fs.put
    dbutils.fs.put(file_path, "")
    #dbutils.notebook.exit("STOP")
else:
    print("Not the 3rd working day โ†’ Stop workflow")
    dbutils.notebook.exit("STOP")
 
Thanks,
Vishal
2 REPLIES

bianca_unifeye
New Contributor III

Hi Vishal,

You're right: there's no single Quartz cron expression that says "run on the 3rd working day (Mon-Fri) of every month". Quartz can handle "Nth weekday of the month" (like WED#1 = the first Wednesday), but not "Nth business day regardless of weekday", so you do need a bit of logic around it.
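
To make that concrete, here are the two kinds of schedule side by side (just a sketch; 09:00 is an arbitrary example time):

# Quartz fields: sec min hour day-of-month month day-of-week
0 0 9 ? * WED#1      # 09:00 on the first Wednesday of every month - expressible in Quartz
0 0 9 ? * MON-FRI    # 09:00 every weekday - the schedule used in step 1 below

There is no Quartz syntax for "the 3rd business day of the month", which is why you need the gate-task pattern below.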

The good news: you don't need 4-5 schedules or a bunch of failed runs in history. You can:

  1. Schedule the workflow every weekday, and

  2. Use a small "gate" task to decide whether to continue or skip that day, and

  3. Make non-3rd-working-day runs show as Succeeded & Skipped, not Failed.

Have a look here

https://stackoverflow.com/questions/78661662/i-would-like-to-create-a-cron-schedule-in-databricks-th...

 

1. Schedule once: every weekday

In the job trigger, use a weekday schedule, for example:

  • In the UI: Advanced schedule → Every Mon-Fri at 09:00

  • Cron example: 0 0 9 ? * MON-FRI

This way the job runs daily on business days only. 

2. Gate task: compute the 3rd business day and exit cleanly

Make your current notebook the first task in the workflow (e.g. check_3rd_workday).
Slightly tweak it so that:

  • It sets a task value with the result

  • It exits successfully on non-3rd days (no exception), so the run is green, not red.

Example:

 

 
from datetime import datetime, timedelta

today = datetime.today()
year, month = today.year, today.month

# Count business days from start of month until today
business_day_count = 0
current = datetime(year, month, 1)

while current <= today:
    if current.weekday() < 5:  # 0=Mon ... 4=Fri
        business_day_count += 1
    current += timedelta(days=1)

is_third_workday = (business_day_count == 3)

print(f"Today is business day #{business_day_count}")
print(f"is_third_workday = {is_third_workday}")

# Expose result to the workflow
dbutils.jobs.taskValues.set(key="is_third_workday", value=is_third_workday)

if not is_third_workday:
    print("Not the 3rd working day → skipping downstream tasks")
    dbutils.notebook.exit("SKIP")  # exits SUCCESSFULLY
else:
    print("3rd working day → continue workflow")
    dbutils.notebook.exit("RUN")
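
As a small aside, the same count can be computed without the loop using pandas (available on the Databricks Runtime by default); a minimal sketch:

from datetime import datetime
import pandas as pd

today = datetime.today()

# Business days (Mon-Fri) from the 1st of the month through today, inclusive
business_day_count = len(pd.bdate_range(start=today.replace(day=1).date(), end=today.date()))
is_third_workday = (business_day_count == 3)

Either way, the taskValues.set / notebook.exit logic above stays the same.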

Key points:

  • dbutils.notebook.exit(...) ends the task as success (unless you raise an exception).

  • dbutils.jobs.taskValues.set(...) lets downstream tasks read is_third_workday. 

So your run history will show this task as green every weekday; the "non-3rd" days are just short runs that exit early.
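
If a downstream notebook task also wants the flag itself (not just the If/else branching), it can read the task value back; a short sketch, assuming the gate task is named check_3rd_workday as above:

# In a downstream notebook task of the same job run
is_third_workday = dbutils.jobs.taskValues.get(
    taskKey="check_3rd_workday",  # task_key of the gate task
    key="is_third_workday",
    default=False,                # returned if the key is not found
    debugValue=False              # returned when the notebook runs outside a job
)

print(f"is_third_workday = {is_third_workday}")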

3. Only run the "real" pipeline when is_third_workday == true

You have two main options:

Option A – If/else condition task (recommended if available)

If your workspace has the If/else condition task:

  1. First task: check_3rd_workday (the notebook above).

  2. Second task: type = If/else condition, expression like:
    {{tasks.check_3rd_workday.values.is_third_workday}} == "true"

  3. Attach your real workflow tasks to the true branch only.

Result:

  • On the 3rd working day: condition = true → branch with your main pipeline runs.

  • Other days: condition = false → that branch stays skipped; entire job shows as succeeded with most tasks greyed out.
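
For reference, the whole wiring can also be expressed as a Jobs API 2.1 job spec. This is only a rough sketch: the job name, notebook paths and timezone are placeholders, and cluster configuration is omitted, so adapt it to your workspace:

# Hypothetical payload for POST /api/2.1/jobs/create - a sketch, not a drop-in config
job_spec = {
    "name": "monthly_3rd_workday_pipeline",             # placeholder name
    "schedule": {
        "quartz_cron_expression": "0 0 9 ? * MON-FRI",  # every weekday at 09:00
        "timezone_id": "UTC",                           # placeholder timezone
    },
    "tasks": [
        {
            "task_key": "check_3rd_workday",
            "notebook_task": {"notebook_path": "/Workspace/pipelines/check_3rd_workday"},
        },
        {
            "task_key": "if_third_workday",
            "depends_on": [{"task_key": "check_3rd_workday"}],
            "condition_task": {
                "op": "EQUAL_TO",
                "left": "{{tasks.check_3rd_workday.values.is_third_workday}}",
                "right": "true",
            },
        },
        {
            "task_key": "main_pipeline",
            # runs only when the condition task evaluates to true
            "depends_on": [{"task_key": "if_third_workday", "outcome": "true"}],
            "notebook_task": {"notebook_path": "/Workspace/pipelines/main_pipeline"},
        },
    ],
}

Building the same three tasks in the UI gives the identical result; the condition expression is the one shown in point 2 above.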

Poorva21
Visitor

Use dbutils.notebook.exit("SKIP") instead of exiting with an error.

In Databricks Workflows:

dbutils.notebook.exit(...) with any string (e.g. "SKIP") → the task ends as Succeeded; the string is just the exit value

Raising an exception → the run is counted as FAILED

Modify your code like this:

if business_day_count == 3:
    print("Today is the 3rd working day → Continue workflow")
else:
    print("Not the 3rd working day → Skip workflow")
    dbutils.notebook.exit("SKIP")

Now your workflow can be scheduled every weekday without generating failures.

This is the simplest fix.