mark_ott
Databricks Employee

Installing Python packages on Databricks serverless compute via asset bundles is possible, but there are limitations and required configuration adjustments compared to traditional jobs or job tasks. The two core approaches for serverless workloads are declaring dependencies in the bundle's environments section or packaging them as Python wheel files.

Key Findings

  • Asset Bundles and Environments: To add third-party libraries to a serverless DLT pipeline, declare them in the environments section of your asset bundle definition. However, simply listing the dependencies in the environment block isn't enough; the task must explicitly reference that environment. Without this reference, your custom or external packages are not installed at runtime.

  • Linking Environment to Task: An environment defined under environments must be attached to your pipeline/job task via its environment_key. This is what makes the task pull in the dependencies you listed.

  • Supported Package Types: Installation via asset bundles is most predictable when you package dependencies as Python wheel files (.whl) and list them in the environment's dependencies property. PyPI packages can also be listed there, but pip-installing directly from PyPI may not work as seamlessly on serverless compute as it does on standard clusters.

  • Manual Install Still Works: You can still install packages at runtime in a notebook using %pip install ..., but this sacrifices the automation and reproducibility that asset bundles provide.

  • Limitations: JAR/Maven packages and direct custom data source connections are not supported on serverless compute; library support is Python-centric.
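Because a missing environment_key link only surfaces at runtime as an import failure, a small guard at the top of your pipeline code can fail fast with a clearer message. A minimal sketch (the require helper and its error text are illustrative, not a Databricks API):

```python
import importlib.util


def require(package: str) -> None:
    """Raise a descriptive error if a declared dependency is missing."""
    if importlib.util.find_spec(package) is None:
        raise ModuleNotFoundError(
            f"'{package}' is not installed. Check that the task's "
            "environment_key references an environment that lists it "
            "under spec.dependencies."
        )


# Verify a dependency before using it; in practice you would pass
# e.g. "pandera" here. "json" is stdlib, so this check always passes.
require("json")
```

A check like this turns a vague downstream ImportError into an immediate, actionable failure at pipeline start.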

Recommended Solution

Update your job/task configuration as follows:

text
environments:
  - environment_key: default
    spec:
      client: "1"
      dependencies:
        - pandera

resources:
  jobs:
    xyz:
      name: x_y_z
      tasks:
        - task_key: PipelineTask
          pipeline_task:
            pipeline_id: ${resources.pipelines.my_pipeline.id}
          environment_key: default # <-- Link environment here

This binding ensures the default environment (which lists pandera as a dependency) is actually used when the pipeline runs.
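After updating the configuration, redeploy the bundle so the change takes effect. A typical CLI sequence (the target name dev and the job key xyz follow the example above; substitute your own):

```shell
# Check the bundle configuration for errors before deploying
databricks bundle validate

# Deploy the updated job and environment definitions to the workspace
databricks bundle deploy -t dev

# Trigger the job to confirm the dependency is installed at runtime
databricks bundle run xyz -t dev
```

These commands require an authenticated Databricks CLI; they are shown here as a workflow sketch rather than a copy-paste recipe.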

Alternative (Wheel Packaging)

If you have more complex dependencies or custom code, pre-package your dependencies (or your code and dependencies) as a wheel file and reference them in your bundle, which is well-supported and robust:

text
environments:
  - environment_key: myenv
    spec:
      dependencies:
        - dist/my_package-0.1.0-py3-none-any.whl
# Reference the environment_key in the task as shown above.
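To produce a wheel like the one referenced here, a minimal pyproject.toml is enough (the package name, version, and pandera dependency are placeholders chosen to match the example filename):

```toml
[build-system]
requires = ["setuptools>=61.0"]
build-backend = "setuptools.build_meta"

[project]
name = "my_package"
version = "0.1.0"
dependencies = ["pandera"]
```

Running `python -m build` (from the PyPI `build` package) then emits `dist/my_package-0.1.0-py3-none-any.whl`, which you can commit or generate in CI before deploying the bundle.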

Summary Table

Installation Approach           Works on Serverless?   Notes
pip in notebook                 Yes                     Manual, not reproducible
Asset bundle, env not linked    No                      Must link environment_key
Asset bundle with wheel file    Yes                     Best for custom code
Asset bundle w/ PyPI in env     Yes (if linked)         Use dependencies block
JAR/Maven dependencies          No                      Not supported

For best results, package your dependencies in a wheel, reference it in your bundle environment, and always link the environment_key in your job/task definition. If your use case is still not supported, fall back to a manual %pip install in a notebook, or check the latest Databricks documentation on serverless package management.
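The notebook fallback mentioned above is a single magic cell; restarting the Python process afterward makes the newly installed package importable in subsequent cells:

```
%pip install pandera
dbutils.library.restartPython()
```

This is a Databricks notebook fragment (%pip and dbutils are notebook-only), so it cannot be captured in a bundle and must be re-run per notebook.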