Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Init Script Fails Intermittently on Workflow Job

leungi
Contributor

I use an init script to install system libraries (shown below).

Attaching the script to a Personal Compute cluster works consistently. The same script, attached to a Workflows job through the job's cluster configuration, fails intermittently with the error shown below.

Both the Personal Compute and Workflows clusters run the 14.3 LTS runtime, so I'm surprised by the instability of the latter.

Any troubleshooting advice is appreciated.

Init Script

#!/bin/bash
set -euxo pipefail
if [[ $DB_IS_DRIVER = "TRUE" ]]; then
  sudo apt-get -y update && apt-get install -y libudunits2-dev libgdal-dev libgeos-dev libproj-dev
fi

Error

[Screenshot of the error message: leungi_0-1718291897408.png]


1 ACCEPTED SOLUTION

Thanks for the suggestion @amr.

Courtesy of a Databricks solutions engineer, the key was to remove all files in the /var/lib/apt/lists/ directory, which forces apt to download fresh package lists on the next update.

Init Script

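#!/bin/bash
set -euxo pipefail
if [[ $DB_IS_DRIVER = "TRUE" ]]; then
  # --- Clear cache
  # Remove cached .deb archives and stale package lists so apt fetches fresh ones.
  # -f keeps rm from failing under `set -e` when a directory is already empty.
  rm -rf /var/cache/apt/archives/* /var/lib/apt/lists/*
  sudo apt-get clean -y
  # Rebuild the package lists from scratch.
  sudo apt-get update -y
  #---
  # Install the system libraries (the lists were refreshed just above).
  sudo apt-get install -y libudunits2-dev libgdal-dev libgeos-dev libproj-dev
fi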
 
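One caveat with clearing the lists directory in a script that runs under `set -e`: when a glob such as `/var/lib/apt/lists/*` matches nothing, bash passes the pattern to `rm` literally, and `rm` without `-f` exits non-zero, which aborts the script. The sketch below is one defensive way to empty a directory. The helper name `clear_dir_contents` is hypothetical (it is not from the thread), and the demo runs against a throwaway directory rather than the real apt paths.

```shell
#!/bin/bash
# Hypothetical helper (not from the thread): delete everything inside a
# directory but keep the directory itself. Returns 0 when the directory is
# already empty or does not exist, so it is safe to call under `set -e`.
clear_dir_contents() {
  local dir="$1"
  [ -d "$dir" ] || return 0
  # -mindepth 1 skips the directory itself; -delete removes children first.
  find "$dir" -mindepth 1 -delete
}

# Demonstration on a temporary directory instead of /var/lib/apt/lists
demo_dir="$(mktemp -d)"
touch "$demo_dir/stale_package_list"
clear_dir_contents "$demo_dir"   # removes the file
clear_dir_contents "$demo_dir"   # already empty: still returns 0
echo "entries left: $(find "$demo_dir" -mindepth 1 | wc -l)"
```

In the init script, the equivalent calls would be `clear_dir_contents /var/cache/apt/archives` and `clear_dir_contents /var/lib/apt/lists`, followed by `apt-get update`.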


2 REPLIES

amr
Databricks Employee

Check the cluster event log for a clue as to why the script is failing. If the script fails and returns a non-zero exit status, the cluster won't start.
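To make a failing step easier to identify, each command's exit status can be recorded before the script exits. This is a minimal sketch, not a Databricks feature: the helper name `run_step` and the log path are assumptions chosen for illustration, and the demo uses the harmless commands `true` and `false` in place of `apt-get`.

```shell
#!/bin/bash
# Assumed log location; any path writable by the init script would do.
LOG_FILE="${LOG_FILE:-/tmp/init_script_debug.log}"

# Hypothetical helper (not from the thread): run a command, append its
# output and exit status to LOG_FILE, and return that status to the caller.
run_step() {
  local status=0
  echo "$(date -u +%FT%TZ) START: $*" >> "$LOG_FILE"
  "$@" >> "$LOG_FILE" 2>&1 || status=$?
  echo "$(date -u +%FT%TZ) END (exit $status): $*" >> "$LOG_FILE"
  return "$status"
}

# Demonstration with harmless commands
run_step true  && echo "true succeeded"
run_step false || echo "false failed with status $?"
```

In an init script, a call such as `run_step apt-get update -y` would still abort the script under `set -e` when the command fails, but the log would show which step returned the non-zero status.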
