05-13-2025 08:20 AM
I have several notebooks that run code to ingest data from various APIs into our Data Warehouse. I have several modules that I reuse in multiple notebooks, such as Redshift functions, string-cleaning functions, and JSON-cleaning functions. Out of nowhere this morning, some notebooks started randomly failing to import modules, or to import functions from those modules.
In the example below, the code fails to import a function (which I confirmed exists and is correctly named).
All the Jobs are running on Serverless. When I run the same notebooks manually, there are no errors. Also, when I just click to "Repair Run" once it fails, it runs normally.
Does anyone have any idea what could possibly be happening?
05-13-2025 09:13 AM - edited 05-13-2025 09:13 AM
Thanks for sharing the error and the context. This intermittent module import issue in Databricks Serverless jobs is a known behavior in some environments, and here's what's likely going wrong:
Root Cause:
A race condition or cold-start issue in serverless clusters where:
-- The notebook starts executing before the module files in /Workspace/Tools are mounted/available.
-- Python import caches may be stale or inconsistent between jobs or cluster warmups.
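If stale import caches are the culprit, Python's `importlib.invalidate_caches()` can refresh the import system's view of directories on `sys.path`. A minimal self-contained sketch that simulates the cold-start scenario (the `json_tools_demo` module name is made up for this demo; in the real setup the directory would be `/Workspace/Tools`):

```python
import importlib
import os
import sys
import tempfile

# Simulate a module file that appears on disk only after the interpreter
# has already scanned sys.path (the serverless cold-start scenario).
tools_dir = tempfile.mkdtemp()
sys.path.append(tools_dir)

with open(os.path.join(tools_dir, "json_tools_demo.py"), "w") as f:
    f.write("def clean_dataframe_jsons():\n    return 'cleaned'\n")

# Without this call, a directory cached before the file existed can
# yield ImportError even though the file is now present on disk.
importlib.invalidate_caches()

import json_tools_demo
print(json_tools_demo.clean_dataframe_jsons())
```

Calling `importlib.invalidate_caches()` right before the failing import is a cheap first thing to try on the failing serverless runs.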
05-13-2025 09:30 AM
Thanks for your response, @lingareddy_Alva!
The code ran for months before starting to exhibit this behavior. Could something have changed now in Databricks?
And how do I fix this? Is a time.sleep(x) after I import the modules something that can help?
05-13-2025 09:44 AM
This is most likely caused by race conditions in cluster/job startup combined with dynamic module paths or
delayed workspace availability in Serverless or ephemeral job clusters. Specifically:
-- sys.path may not yet include /Workspace/Tools when the module is imported.
-- The underlying file system (e.g., DBFS mount of Workspace) might not be fully initialized at the exact moment the import is executed.
-- Workspace imports work fine in interactive sessions because the environment is already fully initialized.
Also, Databricks may have updated runtime behavior or tightened workspace initialization in recent releases,
which could expose previously hidden issues.
Fix Options
1. Use Absolute Imports with Workspace Directories
If your module is in /Workspace/Tools/json_tools.py, make sure you're importing it properly:
import sys
sys.path.append("/Workspace/Tools")
from json_tools import clean_dataframe_jsons
If this fails only sometimes, you can wrap it in retry logic (see below).
2. Retry-Import with time.sleep (Recommended in Your Case)
Yes, you can use time.sleep(x) combined with retry logic:
import time
import sys
sys.path.append("/Workspace/Tools")
retries = 3
for i in range(retries):
    try:
        from json_tools import clean_dataframe_jsons
        break  # success
    except ImportError as e:
        if i < retries - 1:
            print(f"Retrying import due to error: {e}")
            time.sleep(2)  # delay before retrying
        else:
            raise
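Since the same pattern is needed in multiple notebooks, the retry loop can be wrapped into one reusable helper built on `importlib`. A sketch, assuming the module directory is already on `sys.path` (the `import_with_retry` name is made up for illustration):

```python
import importlib
import time

def import_with_retry(module_name: str, retries: int = 3, delay: float = 2.0):
    """Import a module by name, retrying on ImportError (illustrative helper)."""
    for attempt in range(retries):
        try:
            # Refresh cached directory listings so files that appeared
            # after interpreter startup are seen on the next attempt.
            importlib.invalidate_caches()
            return importlib.import_module(module_name)
        except ImportError as e:
            if attempt == retries - 1:
                raise
            print(f"Retry {attempt + 1}/{retries} after ImportError: {e}")
            time.sleep(delay)

# Usage in a notebook (assumes /Workspace/Tools is already on sys.path):
# json_tools = import_with_retry("json_tools")
# clean_dataframe_jsons = json_tools.clean_dataframe_jsons
```

Returning the module object (rather than using `from ... import`) keeps the helper generic across all your shared modules.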
3. Move Reusable Code into a Wheel (.whl) File
If possible, package your shared functions into a .whl file and install it as a library to the job cluster via %pip install or job-level configuration.
This is the most reliable and scalable solution.
Example:
python setup.py bdist_wheel
Then install:
%pip install /Workspace/Tools/my_wheel_package.whl
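For the wheel approach, a minimal setup.py is enough to build the package with the `bdist_wheel` command above. A sketch, where the package name and layout are placeholders for your own shared modules:

```python
# setup.py -- minimal packaging sketch (names below are placeholders)
from setuptools import setup, find_packages

setup(
    name="dw_shared_tools",      # hypothetical package name
    version="0.1.0",
    packages=find_packages(),    # picks up e.g. dw_shared_tools/json_tools.py
    python_requires=">=3.8",
)
```

With this in place, `from dw_shared_tools.json_tools import clean_dataframe_jsons` works identically in every notebook once the wheel is installed, with no dependence on `sys.path` or workspace mount timing.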
4. Avoid Serverless if Determinism is Critical
Switch to a non-serverless job cluster or interactive cluster if possible,
as the Workspace file system is guaranteed to be available earlier in the execution lifecycle.