Hi Community,
We are working on implementing Databricks cluster policies across our organization and are seeking advice on best practices for enforcing governance, security, and cost control in our different environments.
We have two main teams using Databricks across multiple environments:
Data Engineering – Dev / QA / Prod
Data & Analytics – Dev / QA / Prod
Each environment has a separate Databricks workspace. Our goal is to define robust cluster policies that:
Enforce configuration standards (e.g., disallow public IPs, enforce autoscaling, fix certain Spark configs)
Control costs (e.g., limit max workers/memory in dev/QA)
Ensure production stability (e.g., disallow init scripts or spot instances in prod)
Allow safe experimentation in dev while keeping strong guardrails (see the rough sketch after this list)
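To make the kinds of guardrails concrete, here is a rough sketch of the dev/QA policy definition we have in mind. It is only a draft, not a working policy: the worker caps, node types (AWS examples), and Spark conf are placeholders we picked for illustration, although the attribute names follow the cluster policy definition language.

```python
# Rough sketch of a dev/QA cluster policy definition (placeholder values, not final limits).
import json

dev_policy_definition = {
    # Cap cluster size for cost control in dev/QA
    "autoscale.max_workers": {"type": "range", "maxValue": 4, "defaultValue": 2},
    # Force auto-termination so idle dev clusters don't keep running
    "autotermination_minutes": {"type": "range", "maxValue": 60, "defaultValue": 30},
    # Restrict node types to a small allowlist (AWS node types shown as placeholders)
    "node_type_id": {
        "type": "allowlist",
        "values": ["i3.xlarge", "i3.2xlarge"],
        "defaultValue": "i3.xlarge",
    },
    # Example of a fixed Spark config enforced on every cluster
    "spark_conf.spark.databricks.delta.preview.enabled": {"type": "fixed", "value": "true"},
    # Tag everything for cost attribution
    "custom_tags.environment": {"type": "fixed", "value": "dev"},
}

print(json.dumps(dev_policy_definition, indent=2))
```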
We're trying to decide:
Should we define one policy per team per environment (e.g., data-engineering, analytics) or have general reusable policies for each environment type?
What are common policy restrictions used in Dev/QA vs. Prod?
(e.g., disallowing public IPs, enforcing autoscaling, limiting worker sizes, etc.)
Are there any example templates or reusable patterns followed in other large organizations?
Any tips for balancing developer flexibility with platform governance?
How would you differentiate between the Data Engineering and Data & Analytics teams across all environments? Example policy code for this would be especially helpful (a sketch of what we currently have in mind is below).
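To show what we mean by differentiating the teams, here is a rough sketch of how we might create one policy per team in a given workspace through the Cluster Policies REST API (`POST /api/2.0/policies/clusters/create`). The helper names, environment variables, limits, and the `aws_attributes` block (which assumes AWS) are all placeholders for discussion, not a finished implementation.

```python
# Sketch: one cluster policy per team per workspace, created via the Cluster Policies REST API.
# Host/token handling, helper names, and all limits below are placeholders.
import json
import os

import requests

HOST = os.environ["DATABRICKS_HOST"]    # e.g. https://<workspace>.cloud.databricks.com
TOKEN = os.environ["DATABRICKS_TOKEN"]


def base_definition(team: str, env: str, max_workers: int) -> dict:
    """Common guardrails shared by both teams, parameterized per team/environment."""
    return {
        "autoscale.max_workers": {"type": "range", "maxValue": max_workers},
        "autotermination_minutes": {"type": "range", "maxValue": 120, "defaultValue": 60},
        # Force on-demand instances in prod; allow spot with fallback elsewhere (AWS example)
        "aws_attributes.availability": {
            "type": "fixed",
            "value": "ON_DEMAND" if env == "prod" else "SPOT_WITH_FALLBACK",
        },
        # Tags let us split costs between the two teams
        "custom_tags.team": {"type": "fixed", "value": team},
        "custom_tags.environment": {"type": "fixed", "value": env},
    }


def create_policy(name: str, definition: dict) -> None:
    """Create a cluster policy in the current workspace."""
    resp = requests.post(
        f"{HOST}/api/2.0/policies/clusters/create",
        headers={"Authorization": f"Bearer {TOKEN}"},
        json={"name": name, "definition": json.dumps(definition)},
        timeout=30,
    )
    resp.raise_for_status()


# One policy per team in this (dev) workspace; engineering gets a larger cap than analytics
env = "dev"
create_policy(f"data-engineering-{env}", base_definition("data-engineering", env, max_workers=4))
create_policy(f"analytics-{env}", base_definition("analytics", env, max_workers=2))
```

Since each environment is a separate workspace, we would run this per workspace with the `env` value changed, which is partly why we're unsure whether per-team-per-environment policies or shared environment-level policies are the better pattern.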
We appreciate any advice, templates, or governance experiences you can share!
Thanks in advance!