Add policy init_scripts.*.volumes.destination for DLT not working

costi9992
New Contributor III

Hi,

I tried to create a policy to use for DLT pipelines that run on shared clusters, but when I run the DLT pipeline with this policy I get an error. The init script is added to the Allowed JARs/Init Scripts allowlist.

DLT events error:

Cluster scoped init script /Volumes/main/default/bp-lakehouse-monitor/script/test-dlt.sh failed: Script exit status is non-zero

Cluster init-script logs error:

bash: line 11: /Volumes/main/default/bp-lakehouse-monitor/script/init_script.sh: Invalid argument
Script content:

#!/usr/bin/env bash
echo "Loading cluster init script"

Policy Family: Shared Compute
Policy Overrides:
{
  "cluster_type": {
    "type": "fixed",
    "value": "dlt"
  },
  "init_scripts.0.volumes.destination": {
    "type": "fixed",
    "value": "/Volumes/main/default/bp-lakehouse-monitor/script/test-dlt.sh"
  },
  "cluster_log_conf.path": {
    "type": "fixed",
    "value": "dbfs:/cluster-logs/dlt/"
  },
  "cluster_log_conf.type": {
    "type": "fixed",
    "value": "DBFS"
  }
}

Was anyone able to run this scenario successfully? Thanks.

6 REPLIES

Bogdan
Databricks Employee

Sounds like the init script is being run but fails. The failure is coming from init_script.sh, which is called by test-dlt.sh, so it's something specific to these scripts. The full script content doesn't appear in the post (only the first two lines), so I can't tell what is wrong.

costi9992
New Contributor III

Actually, that's the entire script.

At first I had another script (which works fine for clusters/jobs, but it is set from the Databricks UI, not from a policy). After I saw that error, I thought it might be something in the script itself, so to check I deleted the script content and left just an `echo` in it. The same error occurred, saying there is an issue at line 11, but my script has only the two lines shown above. I double-checked the file a couple of times to be sure the uploaded file in the volume has the correct content.
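For reference, checking what actually landed in the volume from a notebook shell cell looks something like this (a minimal sketch, assuming a notebook attached to a UC-enabled cluster; the paths are the ones from this thread):

%sh
# List the files in the UC volume to confirm the script is present
ls -l /Volumes/main/default/bp-lakehouse-monitor/script/
# Print the stored script to confirm its exact content
cat /Volumes/main/default/bp-lakehouse-monitor/script/test-dlt.sh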

Bogdan
Databricks Employee

Still, the error is coming from init_script.sh, so somehow it still ends up being run, even though test-dlt.sh does not reference it. Maybe the cluster already has init_script.sh attached, or there is a global init script? You could also remove the init script from the cluster policy completely to confirm whether the error is related to it, e.g. with the trimmed overrides sketched below.
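That test could use the same overrides with the init_scripts entry dropped (a sketch derived from the policy posted above, not a verified config):

{
  "cluster_type": {
    "type": "fixed",
    "value": "dlt"
  },
  "cluster_log_conf.path": {
    "type": "fixed",
    "value": "dbfs:/cluster-logs/dlt/"
  },
  "cluster_log_conf.type": {
    "type": "fixed",
    "value": "DBFS"
  }
}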

costi9992
New Contributor III
(Accepted solution)

Found the issue finally: when the pipeline is configured (created/edited) in a UC workspace, in order to use it with volumes, "Channel" in the "Advanced" section must be set to Preview. It wasn't an issue caused by the content of the init script.
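For reference, in the pipeline settings JSON this maps to the "channel" field (a minimal sketch assuming the standard Pipelines API shape; the pipeline name and policy id are illustrative placeholders):

{
  "name": "bp-lakehouse-monitor-dlt",
  "catalog": "main",
  "target": "default",
  "channel": "PREVIEW",
  "clusters": [
    {
      "label": "default",
      "policy_id": "<policy-id>"
    }
  ]
}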

ayush007
New Contributor II

@costi9992 I am facing the same issue with a UC-enabled cluster on Databricks Runtime 13.3. I uploaded the init shell script to a Volume, and that particular init script is allowed by the metastore admin, but I get the same error as you stated. When I looked in the cluster logs, it says the file doesn't exist, yet in the cluster settings (Advanced Options -> Init scripts (Preview)) I selected the correct script. Can you please help?

costi9992
New Contributor III

Is your problem with All-Purpose Compute or with Delta Live Tables?

For DLT, in order to have it working with volumes, you must do two things in the DLT settings:

1 - Destination should point to `Unity Catalog`
2 - Channel must be set to `Preview` (at the bottom of the page)

For All-Purpose clusters I didn't use policies, but I was able to load init scripts from volumes.
Your All-Purpose cluster must have AccessMode = `Shared`; see the sketch below.
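As a rough illustration, a shared all-purpose cluster spec with a volumes init script might look like this (a sketch only, assuming the standard Clusters API fields; "USER_ISOLATION" is the API value behind AccessMode = Shared, and the cluster name and node type are placeholders):

{
  "cluster_name": "shared-with-volume-init",
  "spark_version": "13.3.x-scala2.12",
  "node_type_id": "<node-type>",
  "num_workers": 1,
  "data_security_mode": "USER_ISOLATION",
  "init_scripts": [
    {
      "volumes": {
        "destination": "/Volumes/main/default/bp-lakehouse-monitor/script/test-dlt.sh"
      }
    }
  ]
}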

