Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Scheduling in DLT pipelines

rvakr
New Contributor II

Hi Team, 

When I create DLT pipelines, I am not able to add schedules via asset bundles; the UI is the only place that lets me set them.

Is there any other option to create schedules dynamically, such as via the SDK, API, or CLI?

1 ACCEPTED SOLUTION


dbxdev
New Contributor II

If you still want to use the SDK, here is what you can do: update the schedule on the job that triggers the pipeline.

from databricks.sdk import WorkspaceClient
from databricks.sdk.service.jobs import CronSchedule, JobSettings, PauseStatus

w = WorkspaceClient()

def update_job_schedule(job_id: int):
    # jobs.update applies a partial update: only the fields set in
    # new_settings change, so this swaps in a daily 1 AM UTC schedule
    # without touching the rest of the job's configuration.
    w.jobs.update(
        job_id=job_id,
        new_settings=JobSettings(
            schedule=CronSchedule(
                quartz_cron_expression="0 0 1 * * ?",
                timezone_id="UTC",
                pause_status=PauseStatus.UNPAUSED,
            )
        ),
    )

update_job_schedule(1234567890)
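If you prefer the REST API over the SDK, the same partial update can be made as a direct call to the Jobs 2.1 update endpoint. A minimal sketch, assuming the standard DATABRICKS_HOST and DATABRICKS_TOKEN environment variables hold your workspace URL and a token (the job ID is the same placeholder as above):

import os
import requests

# POST /api/2.1/jobs/update applies a partial update to the job's settings.
resp = requests.post(
    f"{os.environ['DATABRICKS_HOST']}/api/2.1/jobs/update",
    headers={"Authorization": f"Bearer {os.environ['DATABRICKS_TOKEN']}"},
    json={
        "job_id": 1234567890,
        "new_settings": {
            "schedule": {
                "quartz_cron_expression": "0 0 1 * * ?",
                "timezone_id": "UTC",
                "pause_status": "UNPAUSED",
            }
        },
    },
)
resp.raise_for_status()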

 


4 REPLIES

dbxdev
New Contributor II

This seems like a known limitation at the moment. A similar post from a prior discussion, with a workaround, is here:

https://community.databricks.com/t5/data-engineering/databricks-asset-bundles-triggering-delta-live-...

davidmorton
Databricks Employee

Unfortunately, you can't set a pipeline schedule in the bundle YAML directly.

A workaround for this would be to create the pipeline as a triggered pipeline, and then create a job that triggers the pipeline periodically. Unlike pipeline schedules, job schedules can be defined in YAML.

Something like the following would probably get the job done:

# resources/my_pipeline_job.yml
resources:
  jobs:
    scheduled_dlt_pipeline:
      name: "Scheduled DLT Pipeline Job"
      schedule:
        quartz_cron_expression: "0 0 8 * * ?"  # Daily at 8 AM
        timezone_id: "America/Chicago"
        pause_status: "UNPAUSED"
      tasks:
        - task_key: run_dlt_pipeline
          pipeline_task:
            pipeline_id: ${resources.pipelines.my_dlt_pipeline.id}
      email_notifications:
        on_failure:
          - data-team@company.com
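If you ever need an ad-hoc run outside that schedule, a triggered pipeline can also be started programmatically. A minimal sketch with the Python SDK, where "your-pipeline-id" is a placeholder for the deployed pipeline's ID:

from databricks.sdk import WorkspaceClient

w = WorkspaceClient()

# Start a single update of the triggered pipeline on demand;
# "your-pipeline-id" is a placeholder, not a real pipeline ID.
run = w.pipelines.start_update(pipeline_id="your-pipeline-id")
print(run.update_id)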

 

dbxdev
New Contributor II

Using workflows to trigger DLT pipelines gives you a couple of other kinds of flexibility as well, such as retries, notifications, and multi-task dependencies, so I would recommend that approach.
