<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>CICD Folder structure for team of 10 Members in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/cicd-folder-structure-for-team-of-10-members/m-p/143180#M52112</link>
    <description>&lt;P&gt;Hi Everyone,&lt;/P&gt;&lt;P&gt;We are in the process of setting up a CI/CD framework for our Databricks ecosystem, and I have a general question around best practices.&lt;/P&gt;&lt;P&gt;We are a team of 10 members, and I’m trying to understand the ideal way to structure our repository and Databricks assets. I’ve gone through several blog posts, but I’m seeing mixed approaches.&lt;/P&gt;&lt;P&gt;Specifically:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;&lt;P&gt;Should we maintain a single top-level databricks.yml and deploy everything for every change?&lt;/P&gt;&lt;/LI&gt;&lt;LI&gt;&lt;P&gt;Or is it better to organize assets project-wise (or domain-wise), each with its own configuration, so changes are scoped only to the relevant project?&lt;/P&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;I’d like to understand what is generally followed across companies and what has worked well in practice for scalability, collaboration, and controlled deployments.&lt;/P&gt;&lt;P&gt;Looking forward to your inputs and recommendations.&lt;/P&gt;&lt;P&gt;Thanks!&lt;/P&gt;</description>
    <pubDate>Wed, 07 Jan 2026 13:15:27 GMT</pubDate>
    <dc:creator>naveenbandla</dc:creator>
    <dc:date>2026-01-07T13:15:27Z</dc:date>
    <item>
      <title>CICD Folder structure for team of 10 Members</title>
      <link>https://community.databricks.com/t5/data-engineering/cicd-folder-structure-for-team-of-10-members/m-p/143180#M52112</link>
      <description>&lt;P&gt;Hi Everyone,&lt;/P&gt;&lt;P&gt;We are in the process of setting up a CI/CD framework for our Databricks ecosystem, and I have a general question around best practices.&lt;/P&gt;&lt;P&gt;We are a team of 10 members, and I’m trying to understand the ideal way to structure our repository and Databricks assets. I’ve gone through several blog posts, but I’m seeing mixed approaches.&lt;/P&gt;&lt;P&gt;Specifically:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;&lt;P&gt;Should we maintain a single top-level databricks.yml and deploy everything for every change?&lt;/P&gt;&lt;/LI&gt;&lt;LI&gt;&lt;P&gt;Or is it better to organize assets project-wise (or domain-wise), each with its own configuration, so changes are scoped only to the relevant project?&lt;/P&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;I’d like to understand what is generally followed across companies and what has worked well in practice for scalability, collaboration, and controlled deployments.&lt;/P&gt;&lt;P&gt;Looking forward to your inputs and recommendations.&lt;/P&gt;&lt;P&gt;Thanks!&lt;/P&gt;</description>
      <pubDate>Wed, 07 Jan 2026 13:15:27 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/cicd-folder-structure-for-team-of-10-members/m-p/143180#M52112</guid>
      <dc:creator>naveenbandla</dc:creator>
      <dc:date>2026-01-07T13:15:27Z</dc:date>
    </item>
    <item>
      <title>Re: CICD Folder structure for team of 10 Members</title>
      <link>https://community.databricks.com/t5/data-engineering/cicd-folder-structure-for-team-of-10-members/m-p/143190#M52114</link>
      <description>&lt;P&gt;If the work is owned by the same team, you can use a single databricks.yml. Each team member develops and tests their own resources locally, then commits to Git. At deployment time, you can deploy everything or scope your CI pipeline to what changed. In development mode, resource names are automatically prefixed with the deploying user’s ID, which prevents naming conflicts across teammates, so a single databricks.yml is both safe and simpler.&lt;/P&gt;</description>
      <pubDate>Wed, 07 Jan 2026 13:50:38 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/cicd-folder-structure-for-team-of-10-members/m-p/143190#M52114</guid>
      <dc:creator>pradeep_singh</dc:creator>
      <dc:date>2026-01-07T13:50:38Z</dc:date>
    </item>
    <item>
      <title>Re: CICD Folder structure for team of 10 Members</title>
      <link>https://community.databricks.com/t5/data-engineering/cicd-folder-structure-for-team-of-10-members/m-p/150257#M53320</link>
      <description>&lt;P&gt;Hi &lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/202775"&gt;@naveenbandla&lt;/a&gt;,&lt;/P&gt;
&lt;P&gt;This is a common decision point when adopting Databricks Asset Bundles (DABs), and the answer depends on how closely coupled your team's work is. Here is a breakdown of the two main patterns and when each works best.&lt;/P&gt;
&lt;P&gt;OPTION 1: SINGLE REPO, SINGLE BUNDLE (MONOLITH)&lt;/P&gt;
&lt;P&gt;Use one databricks.yml at the root with all resources defined (or split across included files).&lt;/P&gt;
&lt;P&gt;When it works well:&lt;BR /&gt;- Your team of 10 shares a common domain (e.g., one data platform team)&lt;BR /&gt;- Resources have cross-dependencies (e.g., jobs that reference shared pipelines or libraries)&lt;BR /&gt;- You want a single deployment artifact per environment&lt;/P&gt;
&lt;P&gt;A typical folder structure looks like:&lt;/P&gt;
&lt;PRE&gt;my-project/
  databricks.yml
  resources/
    jobs/
      ingest_job.yml
      transform_job.yml
    pipelines/
      bronze_pipeline.yml
      silver_pipeline.yml
  src/
    notebooks/
      ingest.py
      transform.py
    python/
      shared_utils/
        __init__.py
        helpers.py
  tests/
    unit/
    integration/&lt;/PRE&gt;
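&lt;P&gt;For context, each YAML file under resources/ defines one resource. A minimal ingest_job.yml could look like the sketch below; the task, cluster, and notebook values are illustrative placeholders, so adjust them to your workspace:&lt;/P&gt;
&lt;PRE&gt;# resources/jobs/ingest_job.yml (illustrative sketch)
resources:
  jobs:
    ingest_job:
      name: ingest_job
      tasks:
        - task_key: ingest
          notebook_task:
            # relative to this YAML file's location
            notebook_path: ../../src/notebooks/ingest.py
          new_cluster:
            spark_version: 15.4.x-scala2.12
            node_type_id: i3.xlarge
            num_workers: 2&lt;/PRE&gt;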
&lt;P&gt;Key points:&lt;BR /&gt;- Use the "include" mapping in databricks.yml to split resource definitions across multiple YAML files so the root file stays clean:&lt;/P&gt;
&lt;PRE&gt;bundle:
  name: my-project

include:
  - "resources/jobs/*.yml"
  - "resources/pipelines/*.yml"

targets:
  dev:
    mode: development
    default: true
    workspace:
      host: https://your-dev-workspace.cloud.databricks.com
  staging:
    workspace:
      host: https://your-staging-workspace.cloud.databricks.com
  prod:
    mode: production
    workspace:
      host: https://your-prod-workspace.cloud.databricks.com
    run_as:
      service_principal_name: "cicd-service-principal"&lt;/PRE&gt;
&lt;P&gt;- In "development" mode, DABs automatically prefixes all deployed resources with [dev &amp;lt;your_username&amp;gt;], so all 10 team members can deploy simultaneously without naming collisions.&lt;BR /&gt;- You can deploy selectively with "databricks bundle deploy -t dev -r my_specific_job" to avoid deploying everything on each change.&lt;/P&gt;
&lt;P&gt;OPTION 2: SINGLE REPO, MULTIPLE BUNDLES (DOMAIN/PROJECT SPLIT)&lt;/P&gt;
&lt;P&gt;Each project or domain gets its own subdirectory with its own databricks.yml. This is the recommended approach when teams or projects are more independent.&lt;/P&gt;
&lt;PRE&gt;repo-root/
  project-a/
    databricks.yml
    src/
    resources/
    tests/
  project-b/
    databricks.yml
    src/
    resources/
    tests/
  shared-libs/
    python/
      common_utils/&lt;/PRE&gt;
&lt;P&gt;When it works well:&lt;BR /&gt;- Different team members own different projects or domains&lt;BR /&gt;- You want changes scoped to only the affected project (faster deploys, smaller blast radius)&lt;BR /&gt;- Projects have different deployment cadences or target different workspaces&lt;/P&gt;
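&lt;P&gt;One caveat with this layout: by default each bundle only syncs files under its own root, so project-a cannot see shared-libs. If I recall the sync mapping correctly, you can pull the shared folder in explicitly; a sketch worth verifying against the settings docs for your CLI version:&lt;/P&gt;
&lt;PRE&gt;# project-a/databricks.yml (illustrative fragment)
bundle:
  name: project-a

sync:
  paths:
    - .
    - ../shared-libs&lt;/PRE&gt;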
&lt;P&gt;In your CI/CD pipeline (GitHub Actions, Azure DevOps, etc.), you can detect which subdirectory changed and only deploy that bundle:&lt;/P&gt;
&lt;PRE&gt;# GitHub Actions example (simplified)
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: dorny/paths-filter@v3
        id: changes
        with:
          filters: |
            project-a:
              - 'project-a/**'
            project-b:
              - 'project-b/**'
      - if: steps.changes.outputs.project-a == 'true'
        run: |
          cd project-a
          databricks bundle deploy -t prod&lt;/PRE&gt;
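&lt;P&gt;The workflow above assumes the Databricks CLI is already installed and authenticated on the runner. A setup step like the following usually precedes the deploy; the host and secret names are placeholders for your own configuration:&lt;/P&gt;
&lt;PRE&gt;      - uses: databricks/setup-cli@main
      - if: steps.changes.outputs.project-a == 'true'
        env:
          DATABRICKS_HOST: https://your-prod-workspace.cloud.databricks.com
          DATABRICKS_CLIENT_ID: ${{ secrets.SP_CLIENT_ID }}
          DATABRICKS_CLIENT_SECRET: ${{ secrets.SP_CLIENT_SECRET }}
        run: |
          cd project-a
          databricks bundle deploy -t prod&lt;/PRE&gt;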
&lt;P&gt;RECOMMENDATION FOR A TEAM OF 10&lt;/P&gt;
&lt;P&gt;For most teams of this size, a hybrid approach works well:&lt;/P&gt;
&lt;P&gt;1. Start with a single bundle if the team shares one domain. The "include" feature keeps things modular, and dev mode prevents conflicts.&lt;/P&gt;
&lt;P&gt;2. Split into separate bundles per project when you notice that unrelated changes are triggering full redeployments, or when sub-teams form around distinct workloads.&lt;/P&gt;
&lt;P&gt;3. Use custom bundle templates to standardize folder structure across all projects. You can create a template and have every team member initialize new projects from it:&lt;/P&gt;
&lt;PRE&gt;databricks bundle init /path/to/your/team-template&lt;/PRE&gt;
&lt;P&gt;This ensures consistent naming, testing structure, and CI/CD configuration across all 10 members.&lt;/P&gt;
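&lt;P&gt;A custom template is just a directory containing a databricks_template_schema.json (which defines the prompts shown during init) plus a template/ folder whose file names and contents can use Go-template variables. Roughly, treating the names below as placeholders:&lt;/P&gt;
&lt;PRE&gt;team-template/
  databricks_template_schema.json    # e.g. prompts for project_name
  template/
    {{.project_name}}/
      databricks.yml.tmpl
      resources/
      src/
      tests/&lt;/PRE&gt;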
&lt;P&gt;ADDITIONAL BEST PRACTICES&lt;/P&gt;
&lt;P&gt;- Use service principals for production deployments. Never deploy to prod with personal credentials.&lt;BR /&gt;- Set "mode: production" on your prod target. This enforces validations like requiring run_as to be set and disabling cluster overrides.&lt;BR /&gt;- Use Git branch validation in your prod target to ensure only the main branch can deploy to production.&lt;BR /&gt;- Keep shared Python libraries in a dedicated folder and reference them via the "libraries" mapping in your job definitions.&lt;BR /&gt;- Use "databricks bundle validate" in your CI pipeline as a pre-merge check to catch configuration errors early.&lt;/P&gt;
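&lt;P&gt;The validate step from the last point is cheap to wire up as a pre-merge check; a minimal sketch, assuming a token for the dev workspace is stored as a repository secret:&lt;/P&gt;
&lt;PRE&gt;# .github/workflows/validate.yml (sketch)
on:
  pull_request:
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: databricks/setup-cli@main
      - run: databricks bundle validate -t dev
        env:
          DATABRICKS_HOST: https://your-dev-workspace.cloud.databricks.com
          DATABRICKS_TOKEN: ${{ secrets.DEV_DATABRICKS_TOKEN }}&lt;/PRE&gt;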
&lt;P&gt;DOCUMENTATION REFERENCES&lt;/P&gt;
&lt;P&gt;- Databricks Asset Bundles overview: &lt;A href="https://docs.databricks.com/aws/en/dev-tools/bundles/" target="_blank"&gt;https://docs.databricks.com/aws/en/dev-tools/bundles/&lt;/A&gt;&lt;BR /&gt;- Bundle configuration (databricks.yml): &lt;A href="https://docs.databricks.com/aws/en/dev-tools/bundles/settings.html" target="_blank"&gt;https://docs.databricks.com/aws/en/dev-tools/bundles/settings.html&lt;/A&gt;&lt;BR /&gt;- CI/CD with Databricks Asset Bundles: &lt;A href="https://docs.databricks.com/aws/en/dev-tools/bundles/ci-cd.html" target="_blank"&gt;https://docs.databricks.com/aws/en/dev-tools/bundles/ci-cd.html&lt;/A&gt;&lt;BR /&gt;- Deployment modes (dev vs production): &lt;A href="https://docs.databricks.com/aws/en/dev-tools/bundles/deployment-modes.html" target="_blank"&gt;https://docs.databricks.com/aws/en/dev-tools/bundles/deployment-modes.html&lt;/A&gt;&lt;BR /&gt;- Custom bundle templates: &lt;A href="https://docs.databricks.com/aws/en/dev-tools/bundles/templates.html" target="_blank"&gt;https://docs.databricks.com/aws/en/dev-tools/bundles/templates.html&lt;/A&gt;&lt;BR /&gt;- GitHub Actions for Databricks: &lt;A href="https://github.com/databricks/setup-cli" target="_blank"&gt;https://github.com/databricks/setup-cli&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;* This reply was drafted with an agent system I built, which researches responses against the documentation I have available and previous memory. I personally review each draft for obvious issues and to monitor system reliability, and I update it when I detect drift, but there is still a small chance something is inaccurate, especially if you are experimenting with brand-new features.&lt;/P&gt;
&lt;P&gt;If this answer resolves your question, could you mark it as "Accept as Solution"? That helps other users quickly find the correct fix.&lt;/P&gt;</description>
      <pubDate>Sun, 08 Mar 2026 20:56:53 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/cicd-folder-structure-for-team-of-10-members/m-p/150257#M53320</guid>
      <dc:creator>SteveOstrowski</dc:creator>
      <dc:date>2026-03-08T20:56:53Z</dc:date>
    </item>
  </channel>
</rss>

