09-15-2025 05:37 AM
Deploying Databricks Asset Bundle Artifacts to Unity Catalog Volumes
Use a Databricks Asset Bundle with a deployment job that runs shell commands to copy artifacts from the workspace bundle path into Unity Catalog Volumes.
Configuration
databricks.yml
bundle:
  name: artifact-deployer

variables:
  catalog:
    default: "your_catalog"
  schema:
    default: "your_schema"
  volume:
    default: "artifacts"

targets:
  dev:
    workspace:
      host: "https://your-workspace.cloud.databricks.com"

resources:
  jobs:
    deploy_artifacts:
      name: "Deploy Artifacts to Volume"
      tasks:
        - task_key: "copy_to_volume"
          notebook_task:
            notebook_path: "./notebooks/deploy"
            base_parameters:
              catalog: ${var.catalog}
              schema: ${var.schema}
              volume: ${var.volume}
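With this in place, deployment and the copy job are driven from the Databricks CLI. A minimal sequence (assuming a recent CLI with bundle support, authenticated against your workspace) looks like:

# Validate the config, sync the bundle files, then trigger the copy job
databricks bundle validate -t dev
databricks bundle deploy -t dev
databricks bundle run deploy_artifacts -t dev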
Deployment Notebook (notebooks/deploy.ipynb)
Cell 1: Setup Parameters
# Create widgets for parameters
dbutils.widgets.text("catalog", "")
dbutils.widgets.text("schema", "")
dbutils.widgets.text("volume", "")

# Get parameter values
catalog_name = dbutils.widgets.get("catalog")
schema_name = dbutils.widgets.get("schema")
volume_name = dbutils.widgets.get("volume")

print(f"Target: {catalog_name}.{schema_name}.{volume_name}")
Cell 2: Create Volume and Directory Structure
# Create Unity Catalog Volume
spark.sql(f"CREATE VOLUME IF NOT EXISTS {catalog_name}.{schema_name}.{volume_name}")

# Create directory structure
dbutils.fs.mkdirs(f"/Volumes/{catalog_name}/{schema_name}/{volume_name}/wheels/")
dbutils.fs.mkdirs(f"/Volumes/{catalog_name}/{schema_name}/{volume_name}/notebooks/")

print("Volume and directories created successfully")
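Since the whole approach depends on the volume being reachable from a shell, it can be worth a quick check before the copy step. A minimal sketch, using the literal placeholder values from the example config above:

%sh
# The new volume should now be browsable through the /Volumes FUSE mount
ls -la /Volumes/your_catalog/your_schema/artifacts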
Cell 3: Define Paths and Environment Variables
# Get current user and define paths
username = dbutils.notebook.entry_point.getDbutils().notebook().getContext().userName().get()
bundle_path = f"/Workspace/Users/{username}/.bundle/your_repo_name/dev/files"
volume_path = f"/Volumes/{catalog_name}/{schema_name}/{volume_name}"

# Set environment variables for shell access
import os
os.environ['BUNDLE_PATH'] = bundle_path
os.environ['VOLUME_PATH'] = volume_path

print(f"Bundle Path: {bundle_path}")
print(f"Volume Path: {volume_path}")
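Note that bundle_path hardcodes the bundle name and target. To avoid that, Databricks Asset Bundles expose substitutions such as ${workspace.file_path} (the workspace folder the bundle files are synced to), which you could pass to the notebook as an extra base parameter in databricks.yml instead of rebuilding the path by hand.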
Cell 4: Copy Artifacts Using Shell Commands
%sh
echo "Copying from: $BUNDLE_PATH"
echo "Copying to: $VOLUME_PATH"

# Copy notebooks
if [ -d "$BUNDLE_PATH/notebooks" ]; then
    cp -r "$BUNDLE_PATH/notebooks/"* "$VOLUME_PATH/notebooks/"
    echo "Notebooks copied successfully"
else
    echo "Notebooks directory not found at $BUNDLE_PATH/notebooks"
fi

# Copy Python wheels
if [ -d "$BUNDLE_PATH/dist" ]; then
    cp "$BUNDLE_PATH/dist/"*.whl "$VOLUME_PATH/wheels/" 2>/dev/null && \
        echo "Python wheels copied successfully" || \
        echo "No wheel files found in $BUNDLE_PATH/dist"
else
    echo "Dist directory not found at $BUNDLE_PATH/dist"
fi

# Verify deployment
echo ""
echo "Deployment Summary:"
echo "Notebooks in volume:"
find "$VOLUME_PATH/notebooks/" -type f 2>/dev/null | wc -l || echo "0"
echo "Wheels in volume:"
find "$VOLUME_PATH/wheels/" -type f 2>/dev/null | wc -l || echo "0"
Key Technical Points
- Parameter Flow: databricks.yml variables → base_parameters → notebook widgets → Python variables
- Path Access: Bundle artifacts at /Workspace/Users/{user}/.bundle/{repo}/dev/files/ are accessible via shell but not dbutils.fs
- Environment Bridge: os.environ passes Python variables to shell commands
- Volume Paths: Unity Catalog Volumes are accessible at /Volumes/{catalog}/{schema}/{volume}/ (the same path used in the cells above)
The same pattern works with JAR files in place of wheels. The reason for copying via shell rather than programmatically is that shell commands can access /Workspace paths directly, while Python file operations and dbutils.fs cannot. The shell script still needs a small tweak to fetch only the latest wheel version; that is a pending enhancement (see the sketch below), but it works for now. And since this simple job can run on serverless compute, deployment is nearly instant (< 2 min) and avoids the 4-5 minute cluster boot-up wait, so there is no need for a job cluster here.
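As a starting point for that enhancement, a sketch of the shell tweak might look like this, relying on sort -V (GNU version sort, available on standard Linux cluster images) to pick the highest-versioned wheel instead of copying them all. It assumes the dist directory holds wheels for a single package, so the filename sort order tracks the version:

# Copy only the highest-versioned wheel from the bundle's dist directory
latest_whl=$(ls "$BUNDLE_PATH/dist/"*.whl 2>/dev/null | sort -V | tail -n 1)
if [ -n "$latest_whl" ]; then
    cp "$latest_whl" "$VOLUME_PATH/wheels/"
    echo "Copied latest wheel: $latest_whl"
else
    echo "No wheel files found in $BUNDLE_PATH/dist"
fi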
Chanukya