2 weeks ago
I saw how to schedule a workflow using the UI, but not with a Python script. Can someone help me find out how to schedule a workflow hourly from a Python script? Thank you.
Labels: Workflows
Accepted Solutions
2 weeks ago
You can use the Databricks SDK or the Databricks REST API to achieve this.
The Databricks SDK uses the API under the hood, but it is more secure. I will share the links to both, and you can choose according to your use case.
Databricks API
- If the job is already created and you want to update it to add a schedule: https://docs.databricks.com/api/workspace/jobs/update#new_settings-schedule
- If you want to create a completely new job: https://docs.databricks.com/api/workspace/jobs/create#schedule
Databricks SDK
- If the job is already created and you want to add a schedule: use the list function to find your workflow, then call the update function with the new job settings.
- If you want to create a new job: use the create function.
The SDK route is a little more complex because you will need to find the right set of attributes and functions to use, but it is worth trying; you can even send the documentation link to an LLM and ask it for help.
https://databricks-sdk-py.readthedocs.io/en/latest/workspace/jobs/jobs.html
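To make the REST route concrete, here is a minimal sketch (standard library only) that attaches an hourly schedule to an existing job via the jobs/update endpoint. The host, token, and job_id values are placeholders you would replace with your own; `hourly_schedule_settings` and `update_job_schedule` are just helper names for this sketch, not part of any Databricks library.

```python
import json
import os
import urllib.request


def hourly_schedule_settings(timezone_id: str = "UTC") -> dict:
    """Schedule block that fires at minute 0 of every hour (Quartz cron)."""
    return {
        "schedule": {
            "quartz_cron_expression": "0 0 * * * ?",
            "timezone_id": timezone_id,
            "pause_status": "UNPAUSED",
        }
    }


def update_job_schedule(host: str, token: str, job_id: int) -> int:
    """POST to /api/2.2/jobs/update to add the schedule to an existing job."""
    body = json.dumps({"job_id": job_id, "new_settings": hourly_schedule_settings()})
    req = urllib.request.Request(
        f"{host}/api/2.2/jobs/update",
        data=body.encode(),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status  # 200 on success


if __name__ == "__main__":
    # Placeholders: set these for your workspace before running.
    host = os.environ.get("DATABRICKS_HOST", "")
    token = os.environ.get("DATABRICKS_TOKEN", "")
    if host and token:
        update_job_schedule(host, token, job_id=123)
```

The SDK's `w.jobs.update(...)` call ends up sending the same `new_settings` payload, so once this shape works you can move to the SDK with the same fields.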
2 weeks ago - last edited 2 weeks ago
You can update your DAB file (databricks.yaml) with cron syntax as below, under jobs.
resources:
  jobs:
    hello-job:
      name: hello-job
      tasks:
        - task_key: hello-task
          existing_cluster_id: 1234-567890-abcde123
          notebook_task:
            notebook_path: ./hello.py
      schedule:
        quartz_cron_expression: "0 0 * * * ?"
        timezone_id: "UTC"
        pause_status: UNPAUSED
Hope this helps.
2 weeks ago
Hey @bricks3,
Exactly. As far as I know, you define the workflow configuration in the YAML file, and under the hood DABs handles the API calls to Databricks (including scheduling).
To run your workflow hourly, you just need to include the schedule block inside your DABS YAML definition like this:
resources:
  jobs:
    my_workflow:
      name: "My Hourly Job"
      tasks:
        - task_key: "main_task"
          notebook_task:
            notebook_path: "/Workspace/Path/To/Notebook"
          job_cluster_key: "cluster"
      schedule:
        quartz_cron_expression: "0 0 * * * ?"
        timezone_id: "UTC"
        pause_status: "UNPAUSED"
That should be all 🙂
Isi
2 weeks ago
Hey @bricks3
If you’re looking to schedule a workflow to run hourly using Python, here’s some clarification and guidance:
To create and schedule a new workflow programmatically, you should use the API.
If you want to create a new job and include the hourly schedule, use this:
POST /api/2.2/jobs/create
This lets you define the job and its scheduling in one go.
If the job already exists and you simply want to add or modify the schedule, use this:
POST /api/2.2/jobs/update
This endpoint allows you to update an existing job's settings.
The scheduling configuration uses Quartz cron expressions. For an hourly schedule, you can use:
"schedule": {
  "quartz_cron_expression": "20 30 * * * ?",
  "timezone_id": "Europe/London",
  "pause_status": "UNPAUSED"
}
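As a sketch, a jobs/create request body with an hourly schedule could be assembled like this. The job name, notebook path, and cluster id are placeholders, and the cron expression here fires at the top of every hour:

```python
import json

# Hypothetical jobs/create request body; name, notebook_path, and
# existing_cluster_id are placeholders for your own values.
create_body = {
    "name": "My Hourly Job",
    "tasks": [
        {
            "task_key": "main_task",
            "existing_cluster_id": "1234-567890-abcde123",
            "notebook_task": {"notebook_path": "/Workspace/Path/To/Notebook"},
        }
    ],
    "schedule": {
        "quartz_cron_expression": "0 0 * * * ?",  # second 0, minute 0, every hour
        "timezone_id": "Europe/London",
        "pause_status": "UNPAUSED",
    },
}

# This JSON string is what you would POST to /api/2.2/jobs/create.
payload = json.dumps(create_body)
```

Defining the job and its schedule in one request avoids a separate update call afterwards.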
If you’re using the Databricks UI:
Go to Workflows, and then in the right panel click “Schedule and Workflows”. There you can select the Schedule interval and configure it to run hourly, daily, etc., using the graphical interface.
Hope this helps, 🙂
Isi
2 weeks ago
@Isi @ashraf1395 Thank you for your replies. I am using DABs; how do I use this configuration in DABs? I cannot edit the workflow in the web UI, and I want to put this configuration in the DABs YAML files. I think DABs uses Terraform and Terraform calls this API, if I am right.

