Random failures with serverless compute running dbt jobs
03-11-2025 08:40 AM
We recently hit the issue below: a Databricks job configured to run a dbt task on serverless compute with a SQL warehouse failed because of a Python dependency installation failure.
The run failed with this error message:
Library installation failed: Library installation attempted on serverless compute and failed due to: Invalid wheel. Please check the wheel file. Error code: ERROR_INVALID_WHEEL, error message: Notebook environment installation failed:
Collecting dbt-databricks<2.0.0,>=1.0.0
Downloading dbt_databricks-1.9.7-py3-none-any.whl (98 kB)
<U+2501><U+2501><U+2501><U+2501><U+2501><U+2501><U+2501><U+2501><U+2501><U+2501><U+2501><U+2501><U+2501><U+2501><U+2501><U+2501><U+2501><U+2501><U+2501><U+2501><U+2501><U+2501><U+2501><U+2501><U+2501><U+2501><U+2501><U+2501><U+2501><U+2501><U+2501><U+2501><U+2501><U+2501><U+2501><U+2501><U+2501><U+2501><U+2501><U+2501> 98.5/98.5 kB 1.6 MB/s eta 0:00:00
Requirement already satisfied: pydantic>=1.10.0 in /databricks/python3/lib/python3.10/site-packages (from dbt-databricks<2.0.0,>=1.0.0->-r /tmp/tmp-6ee059b7563e4c56ae6581c61776c7b8-environment-requirements ...
***WARNING: message truncated. Skipped 7541 bytes of output**
The job ran fine when retried. We want to know the root cause of such failures and why they occur randomly.
03-15-2025 08:52 PM
Hi,
How are you doing today? As I understand it, this kind of random failure is usually caused by network issues, temporary package repository problems, or the way serverless compute handles dependencies. Serverless clusters are short-lived and spin up on demand, so sometimes the environment doesn't initialize properly or a package download gets interrupted. If multiple jobs install libraries at the same time, that could also cause conflicts. Since your job ran fine when retried, it was most likely a temporary glitch.
To avoid this in the future, you could try pre-installing the dependencies in a custom environment instead of installing them at runtime. Adding a simple retry mechanism around the dbt run could also help. Let me know if you want help setting that up!
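If it helps, here is a minimal sketch of what a retry with exponential backoff could look like in Python. The function name `run_with_retries` and its parameters are made up for illustration; they are not part of dbt or Databricks. It also assumes you launch the command yourself (for example from a wrapper script), which may not match how your dbt task is set up.

```python
import subprocess
import time


def run_with_retries(cmd, max_attempts=3, base_delay=5.0, sleep=time.sleep):
    """Run cmd, retrying with exponential backoff on a non-zero exit code.

    Hypothetical helper for illustration. Returns the number of attempts
    used, or raises RuntimeError once every attempt has failed.
    """
    for attempt in range(1, max_attempts + 1):
        result = subprocess.run(cmd)
        if result.returncode == 0:
            return attempt
        if attempt < max_attempts:
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            delay = base_delay * 2 ** (attempt - 1)
            print(f"Attempt {attempt} failed (exit {result.returncode}); "
                  f"retrying in {delay:.0f}s")
            sleep(delay)
    raise RuntimeError(f"Command failed after {max_attempts} attempts: {cmd}")
```

You would call it as `run_with_retries(["dbt", "run"])`. One caveat: in your log the failure happened during library installation, before dbt itself started, so a wrapper like this would probably not catch that exact error. For that case, the retry settings on the job task itself (such as `max_retries` and `min_retry_interval_millis` in the Jobs API) are likely the better fit, since they rerun the whole task including the environment setup.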
Regards,
Brahma

