Scheduling in a DLT pipeline

rvakr
New Contributor II

Hi Team, 

When I create DLT pipelines, I am not able to add schedules via the asset bundle; it only lets me add them from the UI.

Is there any other way to create schedules dynamically, for example using the SDK, API, or CLI?

pradeep_singh
Contributor III

This seems to be a known limitation at the moment. A similar post from a prior discussion, with a workaround, is here:

https://community.databricks.com/t5/data-engineering/databricks-asset-bundles-triggering-delta-live-...

Thank You
Pradeep Singh - https://www.linkedin.com/in/dbxdev

davidmorton
Databricks Employee

Unfortunately, you can't set it in YAML directly. 

A workaround for this would be to create the pipeline as a triggered pipeline, and then create a job that will trigger the pipeline periodically. Unlike pipelines, job schedules can be defined using the YAML. 

Something like the following would probably get the job done:

# resources/my_pipeline_job.yml
resources:
  jobs:
    scheduled_dlt_pipeline:
      name: "Scheduled DLT Pipeline Job"
      schedule:
        quartz_cron_expression: "0 0 8 * * ?"  # Daily at 8 AM
        timezone_id: "America/Chicago"
        pause_status: "UNPAUSED"
      tasks:
        - task_key: run_dlt_pipeline
          pipeline_task:
            pipeline_id: ${resources.pipelines.my_dlt_pipeline.id}
      email_notifications:
        on_failure:
          - data-team@company.com
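
If you would rather go through the REST API than the bundle, the same job can be described as a JSON body for the Jobs API create endpoint (POST /api/2.1/jobs/create). Below is a minimal sketch using only the Python standard library. The helper names build_job_payload and create_job, the host and token arguments, and the pipeline ID placeholder are assumptions for illustration; the field names mirror the YAML above.

```python
import json
import urllib.request


def build_job_payload(pipeline_id: str, cron: str, timezone_id: str) -> dict:
    """Build a Jobs API request body for a job that runs one pipeline on a schedule."""
    return {
        "name": "Scheduled DLT Pipeline Job",
        "schedule": {
            "quartz_cron_expression": cron,
            "timezone_id": timezone_id,
            "pause_status": "UNPAUSED",
        },
        "tasks": [
            {
                "task_key": "run_dlt_pipeline",
                "pipeline_task": {"pipeline_id": pipeline_id},
            }
        ],
        "email_notifications": {"on_failure": ["data-team@company.com"]},
    }


def create_job(host: str, token: str, payload: dict) -> dict:
    """POST the payload to the workspace; host and token are placeholders you supply."""
    request = urllib.request.Request(
        url=f"{host.rstrip('/')}/api/2.1/jobs/create",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())


if __name__ == "__main__":
    # "<your-pipeline-id>" is a placeholder; use the ID of your triggered pipeline.
    payload = build_job_payload("<your-pipeline-id>", "0 0 8 * * ?", "America/Chicago")
    print(json.dumps(payload, indent=2))
```

Building the payload separately from sending it lets you inspect or version the schedule before anything touches the workspace; create_job is only called once you pass in a real workspace URL and token.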

pradeep_singh
Contributor III

Using workflows to trigger DLT pipelines gives you a couple of other kinds of flexibility as well, so I would recommend that approach.

pradeep_singh
Contributor III

If you still want to use the SDK, here is what you can do: update the job's schedule using the SDK.

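from databricks.sdk import WorkspaceClient
from databricks.sdk.service.jobs import CronSchedule, JobSettings, PauseStatus

w = WorkspaceClient()  # picks up credentials from the environment or ~/.databrickscfg

def update_job_schedule(job_id: int):
    # jobs.update is a partial update: only the fields set in new_settings change.
    # new_settings must be a JobSettings object, not a plain dict.
    w.jobs.update(
        job_id=job_id,
        new_settings=JobSettings(
            schedule=CronSchedule(
                quartz_cron_expression="0 0 1 * * ?",  # daily at 1 AM
                timezone_id="UTC",
                pause_status=PauseStatus.UNPAUSED,
            )
        ),
    )
    # update does not return the job, so read it back to confirm the new schedule.
    return w.jobs.get(job_id=job_id).settings.schedule

result = update_job_schedule(1234567890)  # replace with your job ID
print(result)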
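
One thing that can trip people up: the schedule uses Quartz cron syntax, which has a leading seconds field (and an optional trailing year), unlike five-field Unix cron. Here is a small sketch that labels the fields of an expression. describe_quartz is a hypothetical helper, not part of the SDK, and it only splits the string; it does not validate the values.

```python
# Quartz cron field order; the seventh field (year) is optional.
QUARTZ_FIELDS = [
    "seconds", "minutes", "hours", "day_of_month", "month", "day_of_week", "year",
]


def describe_quartz(expression: str) -> dict:
    """Map each part of a Quartz cron expression to its field name."""
    parts = expression.split()
    if len(parts) not in (6, 7):
        raise ValueError(f"expected 6 or 7 fields, got {len(parts)}: {expression!r}")
    return dict(zip(QUARTZ_FIELDS, parts))


print(describe_quartz("0 0 1 * * ?"))
# {'seconds': '0', 'minutes': '0', 'hours': '1', 'day_of_month': '*', 'month': '*', 'day_of_week': '?'}
```

So "0 0 1 * * ?" reads as second 0, minute 0, hour 1, every day of every month, which is daily at 1 AM in the configured timezone.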

rvakr
New Contributor II

@pradeep_singh Thanks for the update.