Managing Databricks infrastructure at scale can quickly become a complex, time-consuming effort, especially when deploying hundreds of workspaces with unique configurations across teams and environments.
The challenge? Replicating environments securely, consistently, and efficiently—without the overhead of constantly tweaking Terraform modules or manually managing infrastructure components.
Here’s how teams can scale infrastructure deployments, boost productivity, and ensure governance, all while freeing up time to focus on delivering business value, by building on the existing Databricks Terraform modules and Security Reference Architecture templates under the hood.
Organizations face recurring pain points when scaling Databricks environments on Azure:
Spinning up a new Databricks workspace shouldn’t feel like a fire drill—but for many teams, it still does. Manual setups, inconsistent configs, and security concerns slow down delivery and drain resources before any data pipeline even runs.
Infrastructure teams often spend hours writing custom scripts to get customers started, an effort that doesn't scale.
Developers are stuck waiting on infrastructure, losing valuable time they could spend building.
DevOps teams carry the burden of enforcing standards and security, often with manual, error-prone processes.
Security teams worry about shared credentials and unclear separation of responsibilities.
The result? Frustration across the board—and a slower path to value.
The approach described here addresses these pain points, enabling faster innovation, greater productivity, and improved collaboration across teams.
To address these challenges, organizations must shift to a configuration-driven infrastructure model that removes complexity while enabling rapid, repeatable deployments.
Figure: Flow diagram
Here’s what that looks like in action:
Use configuration files, not custom Terraform scripts, for repeatable workspace creation, including support for public and Private Link architectures on Azure.
Example config: The configuration below creates a public sandbox workspace plus a simplified Private Link deployment consisting of a regular workspace and its web auth (browser authentication) workspace.
{
  "config": [
    {
      "name": "aira_adb_sandbox_workspace",
      "create_resource_group": true,
      "region": "eastus",
      "rg_name": "aira-adb-sandbox-rg",
      "tags": {
        "environment": "sandbox",
        "department": "AIRA",
        "created-by": "arun.wagle@databricks.com",
        "project-use-case": "Databricks for POC work",
        "create-date": "20250227"
      },
      "type": "az_adb_public"
    },
    {
      "name": "aira_adb_pl_simplified_workspace",
      "is_auth_workspace": false,
      "create_private_dns_zone": true,
      "vnet_name": "aira-adb-eastus-mldevstage-vnet",
      "network_rg_name": "aira-adb-mldevstage-rg",
      "sg_name": "aira_eastus_databricks_mldevstage_internal_nsg",
      "transit_public_subnet_name": "databricks_aira_mldevstage_external",
      "transit_private_subnet_name": "databricks_aira_mldevstage_internal",
      "transit_pl_subnet_name": "databricks_aira_mldevstage_privatelink",
      "rg_name": "aira-adb-mldevstage-rg",
      "private_endpoint_sub_resource_name": "databricks_ui_api",
      "region": "eastus",
      "tags": {
        "environment": "dev",
        "department": "AIRA",
        "created-by": "arun.wagle@databricks.com",
        "project-use-case": "Databricks for MLDevStage",
        "create-date": "20250227"
      },
      "type": "az_adb_pl_simplified"
    },
    {
      "name": "aira_adb_web_auth_DND_workspace_eastus",
      "is_auth_workspace": true,
      "vnet_name": "aira-adb-eastus-mldevstage-vnet",
      "network_rg_name": "aira-adb-mldevstage-rg",
      "sg_name": "aira_eastus_databricks_mldevstage_internal_nsg",
      "transit_public_subnet_name": "databricks_aira_mldevstage_external",
      "transit_private_subnet_name": "databricks_aira_mldevstage_internal",
      "transit_pl_subnet_name": "databricks_aira_mldevstage_privatelink",
      "rg_name": "aira-adb-mldevstage-rg",
      "private_endpoint_sub_resource_name": "browser_authentication",
      "region": "eastus",
      "tags": {
        "environment": "dev",
        "department": "AIRA",
        "created-by": "arun.wagle@databricks.com",
        "project-use-case": "Webauth Databricks Workspace for MLDevStage",
        "create-date": "20250227"
      },
      "type": "az_adb_pl_simplified"
    }
  ]
}
Integrate with Azure Pipelines and Azure Repos to automate infrastructure updates, manage state, and ensure consistency across environments. This can be extended to other repositories, such as GitHub or Bitbucket, and integrated with other CI/CD solutions, such as Jenkins or GitHub Actions. The CI/CD pipeline itself is configuration-driven.
Example config: The CI/CD configuration below lets users specify the cloud, CI/CD platform, repository type, Key Vault, and which optional components to create.
{
  "input_cloud": "azure",
  "input_cicd_platform": "azure_devops",
  "input_repo": "azure_repos",
  "az_kv_name": "kv-terraform-vault",
  "create_az_rg": "No",
  "resource_cfg_nm_az_rg": "cfg-rg-1.json",
  "module_nm_az_rg": "setup-az-create-rg",
  "operation_az_rg": "apply",
  "create_az_vnets": "No",
  "resource_cfg_nm_az_vnets": "cfg-vnets-1.json",
  "module_nm_az_vnets": "setup-az-create-vnets",
  "operation_az_vnets": "apply",
  "create_az_subnets": "No",
  "resource_cfg_nm_az_subnets": "cfg-subnets-1.json",
  "module_nm_az_subnets": "setup-az-create-subnets",
  "operation_az_subnets": "apply",
  "create_adb_workspaces": "Yes",
  "resource_cfg_nm_adb_workspaces": "cfg-all-workspaces-1.json",
  "module_nm_adb_workspaces": "setup-db-all-workspaces",
  "operation_adb_workspaces": "apply"
}
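In a pipeline, these flags can gate each stage. As a rough sketch (assuming jq is available; the file name pipeline-config.json and the helper name should_run are illustrative), a step might read the config and export the variables the Terraform script expects:

# Rough sketch: gate each module on its "create_*" flag from the CI/CD config.
# jq is assumed to be available; "pipeline-config.json" is an illustrative file name.
CONFIG_FILE="pipeline-config.json"

should_run() {
  # Succeeds when the given flag is set to "Yes" in the config file.
  local flag="$1"
  [[ "$(jq -r --arg f "$flag" '.[$f]' "$CONFIG_FILE")" == "Yes" ]]
}

if should_run "create_adb_workspaces"; then
  export MODULE_NAME="$(jq -r '.module_nm_adb_workspaces' "$CONFIG_FILE")"
  export RESOURCE_CONFIG_FILE_NAME="$(jq -r '.resource_cfg_nm_adb_workspaces' "$CONFIG_FILE")"
  export TERRAFORM_OPERATION="$(jq -r '.operation_adb_workspaces' "$CONFIG_FILE")"
  echo "Will run terraform $TERRAFORM_OPERATION for $MODULE_NAME with $RESOURCE_CONFIG_FILE_NAME"
fi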
Deploy resources flexibly and at scale, whether all at once or component by component. Provision VNETs, subnets, resource groups, and Databricks workspaces seamlessly.
The sample project structure below offers a modular approach, allowing you to create all resources at once or build them out component by component.
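One possible layout, inferred from the module names in the CI/CD config and the targets/ path used by the script further below (the configs/ and scripts/ folders and the script file name are illustrative):

.
├── configs/
│   ├── cfg-rg-1.json
│   ├── cfg-vnets-1.json
│   ├── cfg-subnets-1.json
│   └── cfg-all-workspaces-1.json
├── targets/
│   ├── setup-az-create-rg/
│   ├── setup-az-create-vnets/
│   ├── setup-az-create-subnets/
│   └── setup-db-all-workspaces/
└── scripts/
    └── run_terraform.sh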
Manage Terraform state using Azure Storage and Terraform workspaces, reducing manual intervention and the risk of misconfiguration. You can also leverage other solutions, like Terraform Cloud, for Terraform state management.
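The pipeline assumes the state storage already exists. A one-time bootstrap with the Azure CLI might look like the following sketch (resource names are illustrative and should match the backend variables used by the script below):

# One-time bootstrap of the Azure Storage backend that holds Terraform state.
# Names are illustrative; keep them in sync with TERRAFORM_STATE_RESOURCE_GROUP,
# TERRAFORM_STORAGE_ACCT_NAME and TERRAFORM_CONTAINER_NAME used by the script below.
az group create --name terraform-state-rg --location eastus
az storage account create \
  --name tfstatedatabricks001 \
  --resource-group terraform-state-rg \
  --location eastus \
  --sku Standard_LRS
az storage container create \
  --name tfstate \
  --account-name tfstatedatabricks001 \
  --auth-mode login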
Example script (.sh file): This is integrated with the CI/CD process.
init_terraform() {
  # Check required parameters
  if [[ -z "$MODULE_NAME" || -z "$RESOURCE_CONFIG_FILE_NAME" || -z "$TERRAFORM_STATE_FILE_NAME" ]]; then
    echo "Error: Missing required arguments. Usage: init_terraform <module_folder> <config_file> <state_file>"
    exit 1
  fi

  # Change to the Terraform module directory
  cd "targets/$MODULE_NAME" || { echo "Error: Directory $MODULE_NAME not found!"; exit 1; }

  # Terraform init (non-interactive) against the Azure Storage backend
  echo "Initializing Terraform..."
  terraform init \
    -backend-config="storage_account_name=$TERRAFORM_STORAGE_ACCT_NAME" \
    -backend-config="container_name=$TERRAFORM_CONTAINER_NAME" \
    -backend-config="resource_group_name=$TERRAFORM_STATE_RESOURCE_GROUP" \
    -backend-config="key=$TERRAFORM_STATE_FILE_NAME" || { echo "Error: Terraform initialization failed."; exit 1; }
}

run_terraform() {
  # Check required parameters
  if [[ -z "$MODULE_NAME" || -z "$RESOURCE_CONFIG_FILE_NAME" || -z "$TERRAFORM_STATE_FILE_NAME" ]]; then
    echo "Error: Missing required arguments. Usage: run_terraform <module_folder> <config_file> <state_file>"
    exit 1
  fi

  # Change to the Terraform module directory
  cd "targets/$MODULE_NAME" || { echo "Error: Directory $MODULE_NAME not found!"; exit 1; }

  # Terraform workspace setup (non-interactive): one workspace per resource config file
  local workspace_name="${RESOURCE_CONFIG_FILE_NAME%.json}"
  echo "Selecting (or creating) Terraform workspace: $workspace_name"
  terraform workspace select -or-create "$workspace_name" || { echo "Error: Failed to select workspace $workspace_name"; exit 1; }

  # Terraform operation (controlled via env vars)
  echo "TERRAFORM_OPERATION: $TERRAFORM_OPERATION"
  case "$TERRAFORM_OPERATION" in
    plan)    terraform_command="terraform plan -input=false" ;;
    apply)   terraform_command="terraform apply -input=false -auto-approve" ;;
    destroy) terraform_command="terraform destroy -input=false -auto-approve" ;;
    *)       echo "Invalid Terraform operation: $TERRAFORM_OPERATION"; exit 1 ;;
  esac

  # Execute the Terraform operation, passing the resource config and service principal credentials
  echo "Executing: $terraform_command"
  eval "$terraform_command" \
    -var="config_file_name=$RESOURCE_CONFIG_FILE_NAME" \
    -var="client_id=$TERRAFORM_SP" \
    -var="client_secret=$TERRAFORM_SP_SECRET" \
    -var="tenant_id=$AZURE_TENANT" \
    -var="subscription_id=$AZURE_SUBSCRIPTION"
}
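Outside the pipeline, the same functions can be exercised locally for testing. A minimal sketch (all values are placeholders; in CI they are injected from pipeline variables and Key Vault) might look like this:

# Illustrative local run; in CI these values come from pipeline variables and Key Vault.
export MODULE_NAME="setup-db-all-workspaces"
export RESOURCE_CONFIG_FILE_NAME="cfg-all-workspaces-1.json"
export TERRAFORM_STATE_FILE_NAME="adb-workspaces.tfstate"
export TERRAFORM_STORAGE_ACCT_NAME="tfstatedatabricks001"
export TERRAFORM_CONTAINER_NAME="tfstate"
export TERRAFORM_STATE_RESOURCE_GROUP="terraform-state-rg"
export TERRAFORM_OPERATION="plan"
# TERRAFORM_SP, TERRAFORM_SP_SECRET, AZURE_TENANT and AZURE_SUBSCRIPTION should be
# exported from a secret store (see the Key Vault step below), never hard-coded.

source ./scripts/run_terraform.sh   # script path is illustrative
# Each function cd's into targets/$MODULE_NAME, so run them in subshells
# to reset the working directory between calls.
( init_terraform )
( run_terraform )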
Implement Azure Key Vault for secrets management and ensure separation of concerns—DevOps teams handle credentials without exposing them to developers.
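For example, the pipeline can pull the Terraform service principal credentials from the vault at run time, so they never appear in the repository or in developer-facing variables. A minimal Azure CLI sketch (secret names are illustrative; kv-terraform-vault matches the az_kv_name value in the CI/CD config above):

# Fetch the Terraform service principal credentials from Azure Key Vault at run time.
# Secret names are illustrative; only the DevOps pipeline identity needs read access to the vault.
export TERRAFORM_SP="$(az keyvault secret show \
  --vault-name kv-terraform-vault --name terraform-sp-client-id \
  --query value -o tsv)"
export TERRAFORM_SP_SECRET="$(az keyvault secret show \
  --vault-name kv-terraform-vault --name terraform-sp-client-secret \
  --query value -o tsv)"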
Deploy tools like the Databricks Security Analysis Tool (SAT) to monitor environments and ensure governance and compliance with security standards.
The Security Analysis Tool (SAT) is a Databricks solution that analyzes customers' Databricks account and workspace security configurations and provides recommendations to help them follow Databricks security best practices. When customers run SAT, it compares their workspace configurations against a set of security best practices and delivers a report for their Databricks (AWS, Azure, and GCP) workspaces. These checks identify recommendations to harden Databricks configurations, services, and resources.
Figure: Referenced from blog-announcing-security-analysis-tool-sat
Once deployed, the SAT Dashboard displays security scan results for each workspace, sorted by severity.
Figure: Referenced from blog-announcing-security-analysis-tool-sat
Scaling Databricks environments doesn’t have to mean more complexity. With a configuration-driven, automated approach, your teams can move faster, stay secure, and scale efficiently.
Next steps:
Start today—simplify your infrastructure, accelerate your deployments, and empower your teams to focus on what matters most: delivering data-driven value.
Need help getting started or want to explore implementation templates? Let’s connect.