
Asset Bundle Include Glob paths not resolving recursive directories

DavidMoss
New Contributor

Hello,

When trying to include resource definitions in nested yaml files, the recursive paths I am specifying in the include section are not resolving as would be expected.

With the include path resources/**/*.yml and a directory structure as follows:
/
| databricks.yml
| resources
- | clusters
- - | cluster1.yml
- - | cluster2.yml
- | unity-catalog
- - | schemas
- - - | schema1.yml
- - - | schema2.yml

I would expect all .yml files to be included in the deployed template, but only those in resources/clusters are.  If I change the include to resources/**/**/*.yml then all the .yml files in resources/unity-catalog/schemas are included, but the clusters are not.  If I add both paths to the include, then all the nested .yml files are included within the deployed template, though this is obviously a less than ideal workaround.

Am I misunderstanding how these paths are supposed to be resolved or is this a bug in the databricks cli?

For reference I am using Databricks CLI v0.255.0 within a PowerShell terminal on Windows 10.

1 ACCEPTED SOLUTION

mark_ott
Databricks Employee

This behavior is caused by the way the Databricks CLI currently handles recursive globbing for the include section in databricks.yml files. You are not misunderstanding; this is a limitation (and partially a bug) in how the CLI resolves glob patterns for included YAML files rather than a mistake in your configuration.

Explanation

According to the Databricks Asset Bundle configuration documentation, entries in include are relative path globs that behave similarly to .gitignore patterns. In practice, however, the CLI does not appear to expand a double-star (**) fully recursively when resolving include entries; each ** matches only a single directory level. Specifically:

  • resources/**/*.yml expands only one directory deep under resources, meaning files like resources/clusters/cluster*.yml are correctly found.

  • When deeper folders exist (for example, resources/unity-catalog/schemas/schema1.yml), the ** pattern stops matching because the CLI performs only one recursive depth expansion during include resolution, not fully recursive searching.

  • Conversely, when you change the pattern to resources/**/**/*.yml, the traversal starts deeper, catching nested schemas but skipping the shallower directory, since it shifts the base recursion depth.

This behavior affects the include array in databricks.yml but not the sync.include or sync.paths mappings, which use standardized .gitignore-style globbing.
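The depth-dependent matching described above is what you would see if each ** were treated as a plain single-level *. The Python sketch below only illustrates the two semantics and is not the CLI's implementation. It relies on the fact that Python's glob module treats ** like * when recursive=False and as "zero or more directories" when recursive=True. The function name compare_glob_semantics and the scratch-directory setup are hypothetical, chosen for this illustration.

```python
import glob
import os
import tempfile

# Directory layout from the question, relative to the bundle root.
FILES = [
    "resources/clusters/cluster1.yml",
    "resources/clusters/cluster2.yml",
    "resources/unity-catalog/schemas/schema1.yml",
    "resources/unity-catalog/schemas/schema2.yml",
]


def build_tree(root):
    """Create empty files matching the layout from the question."""
    for rel in FILES:
        path = os.path.join(root, *rel.split("/"))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, "w").close()


def match(root, pattern, recursive):
    """Return sorted matches relative to root, with forward slashes."""
    full = os.path.join(glob.escape(root), *pattern.split("/"))
    hits = glob.glob(full, recursive=recursive)
    return sorted(os.path.relpath(h, root).replace(os.sep, "/") for h in hits)


def compare_glob_semantics():
    """Compare single-level and fully recursive handling of '**'."""
    with tempfile.TemporaryDirectory() as root:
        build_tree(root)
        return {
            # '**' behaves like '*': exactly one directory level per '**'.
            "single_one": match(root, "resources/**/*.yml", recursive=False),
            "single_two": match(root, "resources/**/**/*.yml", recursive=False),
            # '**' matches zero or more directory levels.
            "recursive": match(root, "resources/**/*.yml", recursive=True),
        }


if __name__ == "__main__":
    for name, hits in compare_glob_semantics().items():
        print(name, hits)
```

Under single-level semantics the first pattern finds only the cluster files and the second finds only the schema files, which mirrors what the question reports. Under fully recursive semantics the first pattern alone finds all four files, which is what the original poster expected.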

Confirmation of the Bug

A related issue was raised in Databricks CLI GitHub issue #1755, where similar path resolution and glob-root expansion inconsistencies were reported. The Databricks team acknowledged the issue and implemented a patch (#1756 – Expand library globs relative to the sync root) to improve consistent glob expansion, but as of version 0.227.x–0.228.x, this fix primarily affected the sync and resource path resolution, not include.

Hence your observation—having to specify multiple patterns (e.g., both resources/**/*.yml and resources/**/**/*.yml)—is consistent with this incomplete glob-expansion behavior.

Workarounds

Until Databricks extends consistent glob handling to nested includes, you can:

  1. Combine multiple patterns in your include mapping:

    include:
      - resources/**/*.yml
      - resources/**/**/*.yml
  2. Alternatively, flatten the resource structure or consolidate schema definitions where possible.

  3. Upgrade the CLI once a fix for recursive include resolution is officially released. Note that v0.255.0, the version used in the question, is already newer than v0.228.x and still shows this behavior.
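If the directory tree changes often, the list of patterns for the first workaround can be derived from the tree rather than maintained by hand. The following is a minimal sketch, assuming each ** matches exactly one directory level as described in this thread. The helper names include_patterns, render_include_block and demo are hypothetical and not part of the Databricks CLI.

```python
import os
import tempfile


def include_patterns(root, base="resources", ext=".yml"):
    """Return one pattern per directory depth under base that holds ext files."""
    base_dir = os.path.join(root, base)
    depths = set()
    for dirpath, _dirnames, filenames in os.walk(base_dir):
        if any(name.endswith(ext) for name in filenames):
            rel = os.path.relpath(dirpath, base_dir)
            depths.add(0 if rel == "." else len(rel.split(os.sep)))
    # Assumption: one '**/' segment per directory level below base.
    return [base + "/" + "**/" * depth + "*" + ext for depth in sorted(depths)]


def render_include_block(patterns):
    """Format the patterns as an include list for databricks.yml."""
    return "\n".join(["include:"] + ["  - " + p for p in patterns])


def demo():
    """Build the layout from the question in a scratch directory and render it."""
    files = [
        "resources/clusters/cluster1.yml",
        "resources/unity-catalog/schemas/schema1.yml",
    ]
    with tempfile.TemporaryDirectory() as root:
        for rel in files:
            path = os.path.join(root, *rel.split("/"))
            os.makedirs(os.path.dirname(path), exist_ok=True)
            open(path, "w").close()
        return render_include_block(include_patterns(root))


if __name__ == "__main__":
    print(demo())
```

For the layout in the question this produces the same two patterns as the first workaround. The output is meant to be pasted into databricks.yml and reviewed by hand, since the single-level assumption may stop holding in a later CLI release.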

Summary

You’re not misunderstanding the syntax; it’s a known, unresolved bug in the Databricks CLI’s recursive include implementation. As of CLI versions through 0.228.x, the recursive ** matches inconsistently depending on directory depth within the include block.


2 REPLIES

-werners-
Esteemed Contributor III

now THAT is a clear answer!
