Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Using an init script to execute a Python notebook at the all-purpose cluster level

prashant151
New Contributor II

Hi

We have a setup.py notebook in our Databricks workspace.

This script is executed in other transformation scripts using

%run /Workspace/Common/setup.py

which consumes a lot of time.

 

This setup.py internally calls other utilities notebooks using %run

%run /Workspace/Common/01_Utilities.py
%run /Workspace/Common/02_Utilities.py

We are trying to run setup.py at the cluster level. Currently, only shell scripts are allowed as init scripts.

Please advise how we can execute setup.py at the cluster level, so that we can remove the %run call to this notebook from the rest of our notebooks.

@Advika 

1 REPLY

Raman_Unifeye
Contributor III

@prashant151 - Unlike legacy (pre-UC) clusters, you cannot directly run a Databricks notebook (like setup.py) from a cluster init script, because init scripts only support shell commands, not %run or notebook execution.

You will need to refactor your setup logic into a Python module and install it via the init script.
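
As a rough sketch of what that refactor might look like: notebook-level code can be wrapped in plain functions so that importing the module has no side effects. The module name `common_setup` and both helpers below are hypothetical placeholders, since the contents of setup.py and the utilities notebooks are not shown in this thread.

```python
# common_setup.py -- hypothetical module standing in for setup.py and the
# utilities notebooks; none of these names come from the original notebooks.


def normalize_column_name(name: str) -> str:
    """Placeholder for a helper that might live in 01_Utilities.py."""
    return name.strip().lower().replace(" ", "_")


def build_config(env: str = "dev") -> dict:
    """Placeholder for setup logic that previously ran at %run time."""
    return {"env": env, "checkpoint_root": f"/tmp/checkpoints/{env}"}
```

Code that a notebook previously picked up implicitly through %run would then be called explicitly, for example `build_config("prod")`.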

I would do the following instead:

  • Refactor setup.py into a Python package (.whl)
  • Store the .whl in a UC Volume
  • Install it via a cluster init script (attach the init script to your cluster under Advanced Options → Init Scripts)
  • Import the package in your notebooks instead of using %run
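
The install step above could be sketched roughly as follows. This is only an illustration: it generates the text of a small shell init script from Python. The wheel file name, the catalog/schema/volume names in the `/Volumes/...` path, and the pip location `/databricks/python/bin/pip` are assumptions to check against your own workspace and runtime, and the cluster's access mode must allow the init script to read the volume.

```python
from pathlib import Path

# Assumed pip location on Databricks cluster nodes; verify for your runtime.
DATABRICKS_PIP = "/databricks/python/bin/pip"


def render_init_script(wheel_path: str, pip_path: str = DATABRICKS_PIP) -> str:
    """Return the text of a shell init script that installs one wheel."""
    return "\n".join([
        "#!/bin/bash",
        "set -euo pipefail",  # stop cluster startup early if the install fails
        f'{pip_path} install "{wheel_path}"',
        "",
    ])


def write_init_script(target: str, wheel_path: str) -> Path:
    """Write the init script to `target` and return its path."""
    path = Path(target)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(render_init_script(wheel_path))
    return path


if __name__ == "__main__":
    # Hypothetical catalog/schema/volume names and wheel file name.
    wheel = "/Volumes/main/default/libs/common_setup-0.1.0-py3-none-any.whl"
    print(render_init_script(wheel))
```

Once the cluster has restarted with the init script attached, a notebook would use a normal import (for a hypothetical package named `common_setup`, something like `import common_setup`) in place of the `%run /Workspace/Common/setup.py` line.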

 


RG #Driving Business Outcomes with Data Intelligence