
Init Scripts Error When Deploying a Delta Live Table Pipeline with Databricks Asset Bundles

jorperort
Contributor

Hello everyone,

Let me give you some context. I am trying to deploy a Delta Live Table pipeline using Databricks Asset Bundles, which requires a private library hosted in Azure DevOps.

As far as I understand, the library can be installed in three ways:

  1. Installing it directly in the notebook using %pip install (recommended by Databricks).
  2. Using init scripts.
  3. Using cluster policies with init scripts.

The first option works fine. Now, I am testing options 2 and 3, but I am encountering errors in both.
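
For reference, option 1 is just a cell at the top of the notebook along these lines (a sketch on my end; the feed URL and package name below are placeholders, not the real values):

%pip install --index-url https://pkgs.dev.azure.com/<org>/<project>/_packaging/<feed>/pypi/simple/ my-private-lib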

Focusing on option 2 (init scripts), here is what I have done:

  • I created an init script in a Unity Catalog volume.

  • I added it to the allowlist to avoid permission errors.

  • To test its execution, I simplified the script to the following content (a fuller sketch that actually installs the library follows after this list):

#!/bin/bash
echo "The initialization script is running correctly."
  • After uploading it to the bundle, in the clusters section, I defined the following:

clusters:
  - label: default
    node_type_id: Standard_DS3_v2
    autoscale:
      min_workers: 1
      max_workers: 1
      mode: ENHANCED
    spark_env_vars:
      PIP_INDEX_URL: xxx
    init_scripts:
      - volumes:
          destination: /Volumes/system_metadata/init_scripts/cluster_dependencies.sh
libraries:
...
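
For completeness, the non-simplified version of the script would look roughly like this (a sketch; my-private-lib is a placeholder package name, and it assumes the PIP_INDEX_URL value set in spark_env_vars is visible to the init script as an environment variable):

#!/bin/bash
set -euo pipefail

# PIP_INDEX_URL is injected through spark_env_vars in the cluster definition;
# pip reads it automatically as the package index to install from.
echo "Installing the private library from ${PIP_INDEX_URL:-<not set>}"

# my-private-lib is a placeholder for the library hosted in Azure DevOps.
/databricks/python/bin/pip install my-private-lib
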
The deployment completes successfully, but when the pipeline cluster starts, I get the following error:

com.databricks.pipelines.common.errors.deployment.DeploymentException: Failed to launch pipeline cluster xxx: Init scripts failed. instance_id: xxx, databricks_error_message: Cluster scoped init script /Volumes/system_metadata/init_scripts/cluster_dependencies... This error is likely due to a misconfiguration in the pipeline. Check the pipeline cluster configuration and associated cluster policy.
The pipeline does not have any cluster policy associated with it, although several policies do exist in the workspace's compute section. This is the pipeline resource definition in the bundle:

resources:
  pipelines:
    pipeline_dev_x:
      name: "x pipeline"
      clusters:
        - label: default
          node_type_id: Standard_DS3_v2
          spark_env_vars:
            PIP_INDEX_URL: xxx
          init_scripts:
            - volumes:
                destination: /Volumes/system_metadata/init_scripts/cluster_dependencies.sh
          autoscale:
            min_workers: 1
            max_workers: 1
            mode: ENHANCED
      libraries:
        - notebook:
            path: xxx
      schema: xxx
      development: true
      channel: PREVIEW
      catalog: xxx
      deployment:
        kind: BUNDLE
        metadata_file_path: xxx

My questions are:

  1. Can these cluster policies be applied automatically to the pipeline without explicitly referencing them?
  2. I have reviewed the policies, and I don’t see any that could interfere, but has anyone faced this error before?
  3. Has anyone tackled this issue using init scripts or cluster policies (executing the init script from a policy)?
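
Regarding question 3, my understanding is that referencing a policy explicitly would look something like the sketch below (I'm assuming here that pipeline clusters accept policy_id and apply_policy_default_values the same way job clusters do; the ID is a placeholder):

clusters:
  - label: default
    policy_id: "0123456789ABCDEF"  # placeholder ID of an existing cluster policy
    apply_policy_default_values: true
    init_scripts:
      - volumes:
          destination: /Volumes/system_metadata/init_scripts/cluster_dependencies.sh
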
ACCEPTED SOLUTION

jorperort
Contributor

I found the error: it was due to the init script path defined in the bundle.

I'm closing the post.


REPLIES


116680
New Contributor II

Hi,

I'm facing the same issue. Can you elaborate on how the defined path was the root cause?

jorperort
Contributor

Hello @116680,

The problem was with the name of the init script I had uploaded to the volume (it had the correct permissions): the name didn't match the destination defined in the DAB resource that creates the DLT pipeline.
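
In case it helps anyone else hitting this, a quick way to double-check the match (assuming the current Databricks CLI; the catalog, schema, and volume names below are placeholders) is to list the volume and compare the file name with the destination in the bundle:

# List what was actually uploaded to the volume (placeholder path)
databricks fs ls dbfs:/Volumes/<catalog>/<schema>/<volume>/

# The file name shown there must exactly match the init_scripts destination
# in the DAB resource, e.g.:
#   init_scripts:
#     - volumes:
#         destination: /Volumes/<catalog>/<schema>/<volume>/cluster_dependencies.sh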