Hi @naveenbandla,
This is a common decision point when adopting Databricks Asset Bundles (DABs), and the answer depends on how closely coupled your team's work is. Here is a breakdown of the two main patterns and when each works best.
OPTION 1: SINGLE REPO, SINGLE BUNDLE (MONOLITH)
Use one databricks.yml at the root with all resources defined (or split across included files).
When it works well:
- Your team of 10 shares a common domain (e.g., one data platform team)
- Resources have cross-dependencies (e.g., jobs that reference shared pipelines or libraries)
- You want a single deployment artifact per environment
A typical folder structure looks like:
my-project/
  databricks.yml
  resources/
    jobs/
      ingest_job.yml
      transform_job.yml
    pipelines/
      bronze_pipeline.yml
      silver_pipeline.yml
  src/
    notebooks/
      ingest.py
      transform.py
    python/
      shared_utils/
        __init__.py
        helpers.py
  tests/
    unit/
    integration/
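To make the shared_utils package concrete, here is a minimal, hypothetical helper of the kind that might live in src/python/shared_utils/helpers.py and be imported by both notebooks (the function name and behavior are illustrative, not from your project):

```python
# src/python/shared_utils/helpers.py -- hypothetical shared helper that both
# the ingest and transform notebooks could import instead of duplicating logic.

def normalize_column_name(name: str) -> str:
    """Lowercase a raw column name and convert spaces/dashes to underscores."""
    return name.strip().lower().replace(" ", "_").replace("-", "_")


if __name__ == "__main__":
    print(normalize_column_name(" Order ID "))  # -> order_id
```

Keeping utilities like this in one package under src/python is what lets all 10 team members share a single tested implementation.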
Key points:
- Use the "include" mapping in databricks.yml to split resource definitions across multiple YAML files so the root file stays clean:
bundle:
  name: my-project

include:
  - "resources/jobs/*.yml"
  - "resources/pipelines/*.yml"

targets:
  dev:
    mode: development
    default: true
    workspace:
      host: https://your-dev-workspace.cloud.databricks.com
  staging:
    workspace:
      host: https://your-staging-workspace.cloud.databricks.com
  prod:
    mode: production
    workspace:
      host: https://your-prod-workspace.cloud.databricks.com
    run_as:
      service_principal_name: "cicd-service-principal"
- In "development" mode, DABs automatically prefixes all deployed resources with [dev <your_username>], so all 10 team members can deploy simultaneously without naming collisions.
- Note that "databricks bundle deploy" always deploys the whole bundle for the chosen target; there is no per-resource deploy flag. To iterate on a single job, deploy the bundle and then trigger just that job with "databricks bundle run -t dev my_specific_job".
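For illustration, a file matched by the include globs above might look like the following (the job name, notebook path, and cluster settings are hypothetical placeholders):

```yaml
# resources/jobs/ingest_job.yml -- hypothetical job definition picked up
# by the "resources/jobs/*.yml" include glob
resources:
  jobs:
    ingest_job:
      name: ingest_job
      tasks:
        - task_key: ingest
          notebook_task:
            notebook_path: ../../src/notebooks/ingest.py
          new_cluster:
            spark_version: 15.4.x-scala2.12
            node_type_id: i3.xlarge
            num_workers: 2
```

Paths in resource files are resolved relative to the file itself, which is why splitting definitions this way keeps each file self-describing.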
OPTION 2: SINGLE REPO, MULTIPLE BUNDLES (DOMAIN/PROJECT SPLIT)
Each project or domain gets its own subdirectory with its own databricks.yml. This is the recommended approach when teams or projects are more independent.
repo-root/
  project-a/
    databricks.yml
    src/
    resources/
    tests/
  project-b/
    databricks.yml
    src/
    resources/
    tests/
  shared-libs/
    python/
      common_utils/
When it works well:
- Different team members own different projects or domains
- You want changes scoped to only the affected project (faster deploys, smaller blast radius)
- Projects have different deployment cadences or target different workspaces
In your CI/CD pipeline (GitHub Actions, Azure DevOps, etc.), you can detect which subdirectory changed and only deploy that bundle:
# GitHub Actions example (simplified)
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: databricks/setup-cli@main
      - uses: dorny/paths-filter@v3
        id: changes
        with:
          filters: |
            project-a:
              - 'project-a/**'
            project-b:
              - 'project-b/**'
      - if: steps.changes.outputs.project-a == 'true'
        run: |
          cd project-a
          databricks bundle deploy -t prod
RECOMMENDATION FOR A TEAM OF 10
For most teams of this size, a hybrid approach works well:
1. Start with a single bundle if the team shares one domain. The "include" feature keeps things modular, and dev mode prevents conflicts.
2. Split into separate bundles per project when you notice that unrelated changes are triggering full redeployments, or when sub-teams form around distinct workloads.
3. Use custom bundle templates to standardize folder structure across all projects. You can create a template and have every team member initialize new projects from it:
databricks bundle init /path/to/your/team-template
This ensures consistent naming, testing structure, and CI/CD configuration across all 10 members.
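A custom template is a directory containing a databricks_template_schema.json (which defines the prompts shown during init) plus a template/ folder holding the files to generate. A minimal, hypothetical schema might look like this (the property name and default are placeholders):

```json
{
  "properties": {
    "project_name": {
      "type": "string",
      "description": "Name of the new project",
      "default": "my_project"
    }
  }
}
```

Each answer collected by the prompt becomes a variable you can substitute into the generated files under template/.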
ADDITIONAL BEST PRACTICES
- Use service principals for production deployments. Never deploy to prod with personal credentials.
- Set "mode: production" on your prod target. This enforces validations like requiring run_as to be set and disabling cluster overrides.
- Use Git branch validation in your prod target to ensure only the main branch can deploy to production.
- Keep shared Python libraries in a dedicated folder and reference them via the "libraries" mapping in your job definitions.
- Use "databricks bundle validate" in your CI pipeline as a pre-merge check to catch configuration errors early.
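Putting several of these practices together, a hardened production target might look like the following sketch (the host, branch name, and service principal name are placeholders; setting git.branch makes the CLI flag deployments made from any other branch):

```yaml
targets:
  prod:
    mode: production
    git:
      branch: main   # deploys from other branches are flagged
    workspace:
      host: https://your-prod-workspace.cloud.databricks.com
    run_as:
      service_principal_name: "cicd-service-principal"
```

Combined with a "databricks bundle validate" pre-merge check, this keeps production deployments reproducible and auditable.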
DOCUMENTATION REFERENCES
- Databricks Asset Bundles overview: https://docs.databricks.com/aws/en/dev-tools/bundles/
- Bundle configuration (databricks.yml): https://docs.databricks.com/aws/en/dev-tools/bundles/settings.html
- CI/CD with Databricks Asset Bundles: https://docs.databricks.com/aws/en/dev-tools/bundles/ci-cd.html
- Deployment modes (dev vs production): https://docs.databricks.com/aws/en/dev-tools/bundles/deployment-modes.html
- Custom bundle templates: https://docs.databricks.com/aws/en/dev-tools/bundles/templates.html
- GitHub Actions for Databricks: https://github.com/databricks/setup-cli
* This reply was drafted with an agent system I built, which researches responses against the documentation I have available and previous memory. I personally review each draft for obvious issues, monitor the system's reliability, and update the reply if I detect any drift, but there is still a small chance something is inaccurate, especially if you are experimenting with brand-new features.
If this answer resolves your question, could you mark it as "Accept as Solution"? That helps other users quickly find the correct fix.