Databricks Community

RyHubb · ‎01-31-2024

Hello, I'm looking to create a job that triggers a Delta Live Tables pipeline.

Given the job code like this:

    my_job_name:
      name: thejobname
      schedule:
        quartz_cron_expression: 56 30 12 * * ?
        timezone_id: UTC
        pause_status: UNPAUSED
      tasks:
        - task_key: the-task-name
          pipeline_task:
            pipeline_id: thepipelineuuidhere
            full_refresh: false

Is there a way to just use the pipeline name? I have the pipeline declared in the same yml file. For now, I've deployed, copied the generated ID, and pasted it into the yml, but that won't work on a fresh install. Is there some way to get that ID dynamically?

I tried just putting the name of the pipeline, but that doesn't work (it can't find the pipeline).

Yeshwanth · ‎02-01-2024

@RyHubb

You can reference the pipeline's ID through a bundle variable, and it will be resolved at deploy time, so there is no need to copy the ID in yourself. An example is at https://github.com/databricks/bundle-examples/blob/24678f538415ab936e341a04fce207dce91093a8/default_...
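
As a rough sketch of what that can look like, assuming the pipeline and the job are declared in the same bundle: the resource key my_pipeline, the pipeline name, and the notebook path below are placeholders I made up for illustration, so swap in whatever your yml actually uses. The part that matters is the ${resources.pipelines.<resource key>.id} reference, which takes the key the pipeline is declared under (not its display name).

    resources:
      pipelines:
        my_pipeline:                       # hypothetical resource key
          name: thepipelinename            # hypothetical display name
          libraries:
            - notebook:
                path: ./src/my_dlt_notebook.ipynb   # hypothetical path

      jobs:
        my_job_name:
          name: thejobname
          schedule:
            quartz_cron_expression: 56 30 12 * * ?
            timezone_id: UTC
            pause_status: UNPAUSED
          tasks:
            - task_key: the-task-name
              pipeline_task:
                # Filled in with the real pipeline ID when the bundle is deployed
                pipeline_id: ${resources.pipelines.my_pipeline.id}
                full_refresh: false

With a reference like this, a fresh deploy should create the pipeline first and then plug its generated ID into the job, so nothing has to be pasted in by hand.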

Let me know if this helps

Yeshwanth · ‎01-31-2024

Hey @RyHubb

See the Delta Live Tables API's create pipeline request payload reference: https://docs.databricks.com/api/workspace/pipelines/create

RyHubb · ‎02-01-2024

Yeah, I know how to create a pipeline (I have it defined in the same yaml file), but that doesn't explain how to create a job that kicks off a pipeline defined in that same file. You need the pipeline's ID to create the job, and as far as I know, the ID is only generated when the pipeline is created. So how do I reference the pipeline's ID if I don't know what it is yet?