<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Spark version errors in &quot;Build an ETL pipeline with Lakeflow Spark Declarative Pipelines&quot; in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/spark-verison-errors-in-quot-build-an-etl-pipeline-with-lakeflow/m-p/142633#M51987</link>
    <description>Thread: linking a Databricks Job to an already-deployed Lakeflow pipeline defined with the Asset Bundle Python SDK.</description>
    <pubDate>Mon, 29 Dec 2025 11:26:32 GMT</pubDate>
    <dc:creator>liquibricks</dc:creator>
    <dc:date>2025-12-29T11:26:32Z</dc:date>
    <item>
      <title>Spark version errors in "Build an ETL pipeline with Lakeflow Spark Declarative Pipelines"</title>
      <link>https://community.databricks.com/t5/data-engineering/spark-verison-errors-in-quot-build-an-etl-pipeline-with-lakeflow/m-p/142633#M51987</link>
      <description>&lt;P&gt;I'm trying to define a job for a pipeline using the Asset Bundle Python SDK. I created the pipeline first (using the SDK) and I'm now trying to add the Job. The DAB validates and deploys successfully, but when I run the Job I get an error:&lt;/P&gt;&lt;P&gt;UNAUTHORIZED_ERROR: User &amp;lt;some-guid&amp;gt; does not have Run permissions on pipeline None.&lt;/P&gt;&lt;P&gt;How can I define the job so that it links to the already existing pipeline (which is already running in Continuous mode)?&lt;/P&gt;&lt;P&gt;The DAB code is as follows:&lt;/P&gt;&lt;LI-CODE lang="python"&gt;my_pipeline = Pipeline(
    name="My Pipeline",
    catalog="mycatalog",
    schema="default",
    continuous=True,
    clusters=[
        PipelineCluster(
            ...
        )
    ],
    libraries=[
        PipelineLibrary(
            file=FileLibrary(path="src/my_sdp.py")
        )
    ]
)

my_task = Task(
    task_key="My_pipeline_task",
    pipeline_task=PipelineTask(
        pipeline_id=str(my_pipeline.id)
    )
)

my_job = Job(
    name="My Pipeline Job",
    tasks=[
        my_task
    ]
)&lt;/LI-CODE&gt;</description>
      <pubDate>Mon, 29 Dec 2025 11:26:32 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/spark-verison-errors-in-quot-build-an-etl-pipeline-with-lakeflow/m-p/142633#M51987</guid>
      <dc:creator>liquibricks</dc:creator>
      <dc:date>2025-12-29T11:26:32Z</dc:date>
    </item>
    <item>
      <title>Re: Spark version errors in "Build an ETL pipeline with Lakeflow Spark Declarative Pipelines"</title>
      <link>https://community.databricks.com/t5/data-engineering/spark-verison-errors-in-quot-build-an-etl-pipeline-with-lakeflow/m-p/142636#M51989</link>
      <description>&lt;P&gt;Hi, if you've already created the pipeline, you don't need to create it again via the DAB: just get the pipeline ID from the UI and pass that into your job. Also, the syntax for the task and the job should look more like this:&lt;/P&gt;
&lt;LI-CODE lang="markup"&gt;  jobs:
    my_pipeline_job:
      name: my-pipeline-job
      tasks:
        - task_key: my-pipeline-task
          pipeline_task:
            pipeline_id: [ID of the existing pipeline]&lt;/LI-CODE&gt;</description>
      <pubDate>Mon, 29 Dec 2025 11:51:32 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/spark-verison-errors-in-quot-build-an-etl-pipeline-with-lakeflow/m-p/142636#M51989</guid>
      <dc:creator>emma_s</dc:creator>
      <dc:date>2025-12-29T11:51:32Z</dc:date>
    </item>
    <item>
      <title>Re: Spark version errors in "Build an ETL pipeline with Lakeflow Spark Declarative Pipelines"</title>
      <link>https://community.databricks.com/t5/data-engineering/spark-verison-errors-in-quot-build-an-etl-pipeline-with-lakeflow/m-p/142640#M51990</link>
      <description>&lt;P&gt;The error happens because my_pipeline.id does not exist when the Asset Bundle is defined. Resource IDs are only created after deployment, so your job is effectively created with pipeline_id set to the string "None". When the job runs, Databricks tries to run a pipeline with ID None, which results in the “Run permissions on pipeline None” error.&lt;/P&gt;&lt;P&gt;In Databricks Asset Bundles, you must link resources symbolically, not by accessing their IDs directly in Python.&lt;/P&gt;&lt;P&gt;To fix this, reference the pipeline using the bundle resource reference syntax:&lt;/P&gt;&lt;LI-CODE lang="python"&gt;my_task = Task(
    task_key="My_pipeline_task",
    pipeline_task=PipelineTask(
        pipeline_id="${resources.pipelines.my_pipeline.id}"
    )
)&lt;/LI-CODE&gt;&lt;P&gt;Here, my_pipeline is the key under which the pipeline resource is registered in the bundle (not the Python variable itself). Databricks resolves this reference to the actual pipeline ID at deploy time.&lt;/P&gt;&lt;P&gt;Your job definition can then remain unchanged.&lt;/P&gt;&lt;P&gt;One important note: because your pipeline is running in continuous mode, triggering it from a job will restart the pipeline each time the job runs. If you don't need scheduled restarts or orchestration with other tasks, you may not need a job at all; just deploying the pipeline is sufficient.&lt;/P&gt;&lt;P&gt;Key takeaway: never use .id directly in Asset Bundle code. Always use ${resources.&amp;lt;type&amp;gt;.&amp;lt;name&amp;gt;.id} to link bundle-managed resources.&lt;/P&gt;</description>
      <pubDate>Mon, 29 Dec 2025 12:54:58 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/spark-verison-errors-in-quot-build-an-etl-pipeline-with-lakeflow/m-p/142640#M51990</guid>
      <dc:creator>ethanop</dc:creator>
      <dc:date>2025-12-29T12:54:58Z</dc:date>
    </item>
    <item>
      <title>Re: Spark version errors in "Build an ETL pipeline with Lakeflow Spark Declarative Pipelines"</title>
      <link>https://community.databricks.com/t5/data-engineering/spark-verison-errors-in-quot-build-an-etl-pipeline-with-lakeflow/m-p/142655#M51994</link>
      <description>&lt;P&gt;This happens because the job is not actually linked to the deployed pipeline, so the pipeline ID is None at runtime. When using Asset Bundles, the pipeline ID is only resolved after deployment, so referencing my_pipeline.id in code does not work. Instead, the job must reference the pipeline via a bundle resource reference, not a Python variable. Define the pipeline and job as bundle resources and set the pipeline task's pipeline_id to the bundle reference for that pipeline. Also ensure that the job owner has Run permission on the pipeline. Once the job correctly references the deployed pipeline resource and permissions are in place, the unauthorized error will be resolved.&lt;/P&gt;</description>
      <pubDate>Mon, 29 Dec 2025 16:05:19 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/spark-verison-errors-in-quot-build-an-etl-pipeline-with-lakeflow/m-p/142655#M51994</guid>
      <dc:creator>mukul1409</dc:creator>
      <dc:date>2025-12-29T16:05:19Z</dc:date>
    </item>
  </channel>
</rss>

