Deploying Databricks Asset Bundle Artifacts to Unity Catalog Volumes
Use a Databricks Asset Bundle with a deployment job that runs shell commands to copy artifacts from the workspace bundle path to a Unity Catalog Volume.
Configuration
databricks.yml
bundle:
  name: artifact-deployer

targets:
  dev:
    workspace:
      host: "https://your-workspace.cloud.databricks.com"

variables:
  catalog:
    default: "your_catalog"
  schema:
    default: "your_schema"
  volume:
    default: "artifacts"

resources:
  jobs:
    deploy_artifacts:
      name: "Deploy Artifacts to Volume"
      tasks:
        - task_key: "copy_to_volume"
          notebook_task:
            notebook_path: "./notebooks/deploy.ipynb"
            base_parameters:
              catalog: ${var.catalog}
              schema: ${var.schema}
              volume: ${var.volume}
Deployment Notebook (notebooks/deploy.ipynb)
Cell 1: Setup Parameters
# Create widgets for parameters
dbutils.widgets.text("catalog", "")
dbutils.widgets.text("schema", "") dbutils.widgets.text("volume", "")
# Get parameter values
catalog_name = dbutils.widgets.get("catalog")
schema_name = dbutils.widgets.get("schema")
volume_name = dbutils.widgets.get("volume")
print(f"Target: {catalog_name}.{schema_name}.{volume_name}")
Cell 2: Create Volume and Directory Structure
# Create Unity Catalog Volume
spark.sql(f"CREATE VOLUME IF NOT EXISTS {catalog_name}.{schema_name}.{volume_name}")
# Create directory structure
dbutils.fs.mkdirs(f"/Volumes/{catalog_name}/{schema_name}/{volume_name}/wheels/")
dbutils.fs.mkdirs(f"/Volumes/{catalog_name}/{schema_name}/{volume_name}/notebooks/")
print("Volume and directories created successfully")
Cell 3: Define Paths and Environment Variables
# Get current user and define paths
username = dbutils.notebook.entry_point.getDbutils().notebook().getContext().userName().get()
bundle_path = f"/Workspace/Users/{username}/.bundle/your_repo_name/dev/files"
volume_path = f"/Volumes/{catalog_name}/{schema_name}/{volume_name}"
# Set environment variables for shell access
import os
os.environ['BUNDLE_PATH'] = bundle_path
os.environ['VOLUME_PATH'] = volume_path
print(f"Bundle Path: {bundle_path}")
print(f"Volume Path: {volume_path}")
Cell 4: Copy Artifacts Using Shell Commands
%%sh
echo "Copying from: $BUNDLE_PATH"
echo "Copying to: $VOLUME_PATH"
# Copy notebooks
if [ -d "$BUNDLE_PATH/notebooks" ]; then
cp -r "$BUNDLE_PATH/notebooks/"* "$VOLUME_PATH/notebooks/"
echo "Notebooks copied successfully"
else
echo "Notebooks directory not found at $BUNDLE_PATH/notebooks"
fi
# Copy Python wheels
if [ -d "$BUNDLE_PATH/dist" ]; then
cp "$BUNDLE_PATH/dist/"*.whl "$VOLUME_PATH/wheels/" 2>/dev/null && \
echo "Python wheels copied successfully" || \
echo "No wheel files found in $BUNDLE_PATH/dist"
else
echo "Dist directory not found at $BUNDLE_PATH/dist"
fi
# Verify deployment
echo ""
echo "Deployment Summary:"
echo "Notebooks in volume:"
find "$VOLUME_PATH/notebooks/" -type f 2>/dev/null | wc -l || echo "0"
echo "Wheels in volume:" find "$VOLUME_PATH/wheels/" -type f 2>/dev/null | wc -l || echo "0"
Key Technical Points
- Parameter Flow: databricks.yml variables → base_parameters → notebook widgets → Python variables
- Path Access: Bundle artifacts at /Workspace/Users/{user}/.bundle/{repo}/dev/files/ are accessible via shell but not dbutils.fs
- Environment Bridge: os.environ passes Python variables to shell commands
- Volume Paths: Unity Catalog Volumes are accessible at /Volumes/{catalog}/{schema}/{volume}/ (the same path used by both dbutils.fs and the shell commands above)
You can replace the .whl files with JAR files using the same approach. The reason for not doing the copy programmatically is that shell commands can access /Workspace paths directly, while Python file operations and dbutils.fs cannot. The shell script still needs a small tweak to pick up only the latest wheel version; that is a planned enhancement (one possible approach is sketched below), but it works for now. And since this simple job can run on serverless compute, deployment is quick (under 2 minutes) with no 4-5 minute cluster startup to wait through, so there is no need for a job cluster here.
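One possible sketch for that latest-wheel enhancement, assuming standard <name>-<version>-<tags>.whl file names and using a shell process for the /Workspace listing (consistent with the path-access note above); the LATEST_WHEEL variable name is just an illustration:
# Sketch for the "latest wheel only" enhancement: list dist/ through a shell
# process, pick the highest version, and expose it to the %%sh copy step via
# an environment variable
import os
import subprocess
from packaging.version import Version  # typically preinstalled on Databricks runtimes

listing = subprocess.run(
    ["ls", f"{os.environ['BUNDLE_PATH']}/dist"], capture_output=True, text=True
)
wheels = [f for f in listing.stdout.split() if f.endswith(".whl")]
if wheels:
    # wheel names follow <name>-<version>-<tags>.whl, so the second
    # dash-separated field is the version string
    latest = max(wheels, key=lambda f: Version(f.split("-")[1]))
    os.environ["LATEST_WHEEL"] = latest
    print(f"Latest wheel: {latest}")
else:
    print("No wheel files found in dist/")
If this cell runs before the copy step, the %%sh cell could then copy only "$BUNDLE_PATH/dist/$LATEST_WHEEL" instead of every *.whl file.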
Chanukya