
Init Scripts Error When Deploying a Delta Live Table Pipeline with Databricks Asset Bundles

jorperort
Contributor

Hello everyone,

Let me give you some context. I am trying to deploy a Delta Live Table pipeline using Databricks Asset Bundles, which requires a private library hosted in Azure DevOps.

As far as I understand, the library can be installed in three ways:

  1. Installing it directly in the notebook using %pip install (recommended by Databricks).
  2. Using init scripts.
  3. Using cluster policies with init scripts.

The first option works fine. Now, I am testing options 2 and 3, but I am encountering errors in both.
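
For reference, option 1 is just a cell at the top of the notebook along these lines (a sketch on my end; the feed URL and package name below are placeholders, not the real values):

%pip install --index-url https://pkgs.dev.azure.com/<org>/<project>/_packaging/<feed>/pypi/simple/ my-private-lib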

Focusing on option 2 (init scripts), here is what I have done:

  • I created an init script in a Unity Catalog volume.

  • I added it to the allowlist to avoid permission errors.

  • To test its execution, I simplified the script to the following content (a fuller sketch that actually installs the library follows after this list):

#!/bin/bash
echo "The initialization script is running correctly."
  • After uploading it to the bundle, in the clusters section, I defined the following:

clusters:
  - label: default
    node_type_id: Standard_DS3_v2
    autoscale:
      min_workers: 1
      max_workers: 1
      mode: ENHANCED
    spark_env_vars:
      PIP_INDEX_URL: xxx
    init_scripts:
      - volumes:
          destination: /Volumes/system_metadata/init_scripts/cluster_dependencies.sh
libraries:
...
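
For completeness, the non-simplified version of the script would look roughly like this (a sketch; my-private-lib is a placeholder package name, and it assumes the PIP_INDEX_URL value set in spark_env_vars is visible to the init script as an environment variable):

#!/bin/bash
set -euo pipefail

# PIP_INDEX_URL is injected through spark_env_vars in the cluster definition;
# pip reads it automatically as the package index to install from.
echo "Installing the private library from ${PIP_INDEX_URL:-<not set>}"

# my-private-lib is a placeholder for the library hosted in Azure DevOps.
/databricks/python/bin/pip install my-private-lib
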
The deployment completes successfully, but when the pipeline cluster starts, I get the following error:

com.databricks.pipelines.common.errors.deployment.DeploymentException: Failed to launch pipeline cluster xxx: Init scripts failed. instance_id: xxx, databricks_error_message: Cluster scoped init script /Volumes/system_metadata/init_scripts/cluster_dependencies... This error is likely due to a misconfiguration in the pipeline. Check the pipeline cluster configuration and associated cluster policy.
The pipeline does not have any cluster policy associated with it, although several policies do exist in the workspace's compute section. This is the pipeline resource definition in the bundle:

resources:
  pipelines:
    pipeline_dev_x:
      name: "x pipeline"
      clusters:
        - label: default
          node_type_id: Standard_DS3_v2
          spark_env_vars:
            PIP_INDEX_URL: xxx
          init_scripts:
            - volumes:
                destination: /Volumes/system_metadata/init_scripts/cluster_dependencies.sh
          autoscale:
            min_workers: 1
            max_workers: 1
            mode: ENHANCED
      libraries:
        - notebook:
            path: xxx
      schema: xxx
      development: true
      channel: PREVIEW
      catalog: xxx
      deployment:
        kind: BUNDLE
        metadata_file_path: xxx

My questions are:

  1. Can these cluster policies be applied automatically to the pipeline without explicitly referencing them?
  2. I have reviewed the policies, and I don’t see any that could interfere, but has anyone faced this error before?
  3. Has anyone tackled this issue using init scripts or cluster policies (executing the init script from a policy)?
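
Regarding question 3, my understanding is that referencing a policy explicitly would look something like the sketch below (I'm assuming here that pipeline clusters accept policy_id and apply_policy_default_values the same way job clusters do; the ID is a placeholder):

clusters:
  - label: default
    policy_id: "0123456789ABCDEF"  # placeholder ID of an existing cluster policy
    apply_policy_default_values: true
    init_scripts:
      - volumes:
          destination: /Volumes/system_metadata/init_scripts/cluster_dependencies.sh
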
ACCEPTED SOLUTION

jorperort
Contributor

I found the error: it was due to the init script path defined in the bundle.

I'm closing the post.


REPLIES


116680
New Contributor II

Hi,

I'm facing the same issue. Can you elaborate on how the defined path was the root cause?

jorperort
Contributor

Hello @116680,

The problem was with the name of the init script I had uploaded to the volume (it had the correct permissions): the name didn't match the destination defined in the DAB resource that creates the DLT pipeline.
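
In case it helps anyone else hitting this, a quick way to double-check the match (assuming the current Databricks CLI; the catalog, schema, and volume names below are placeholders) is to list the volume and compare the file name with the destination in the bundle:

# List what was actually uploaded to the volume (placeholder path)
databricks fs ls dbfs:/Volumes/<catalog>/<schema>/<volume>/

# The file name shown there must exactly match the init_scripts destination
# in the DAB resource, e.g.:
#   init_scripts:
#     - volumes:
#         destination: /Volumes/<catalog>/<schema>/<volume>/cluster_dependencies.sh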