yesterday
I would like to understand the differences between Terraform and Asset Bundles, especially since in some cases they can do the same thing. I'm not talking about provisioning storage, networking, or the Databricks workspace itself; I know that is Terraform's role. However, Bundles now support creating all-purpose compute, secrets, databases, schemas, and apps, which are all things Terraform can do as well. I've seen Asset Bundles excel at creating workflows, but now I'm unsure when to use Bundles and when to use Terraform, given that they share similar capabilities. Are there best practices? For example: should pipelines be handled by Bundles, while clusters, secrets, and infrastructure pieces are left to Terraform? Should I use them together? I would love to hear some opinions.
yesterday
First, DAB uses Terraform in the background. Having said that, my recommendation is to use DAB for every component it already supports, and to use other IaC tools only for what is not supported yet or is not Databricks-specific (private VNets, external storage, etc.). This is what I use in real-life applications with Databricks.
9 hours ago
Thanks for answering!
I really like this approach, but how do I separate people who should only develop in the repository from people who are allowed to change the bundle? Let's suppose that I'm creating a schema and granting permissions with the bundle. As the bundle is kept in the same repository as the code, the developers can see and change the file. This could be blocked with code review and by withholding merge privileges on the branches that run the CI/CD with a service principal, but that seems easier to bypass than a separate Terraform repository that is open only to people who have admin rights on the platform and know Terraform. On the other hand, a separate repository will make development more bureaucratic. What do you think?
9 hours ago
You can keep a separate repository with Databricks CLI scripts and/or Terraform IaC code for specific tasks, such as assigning permissions, that you do not want to share with developers. Your CI/CD pipelines can then access both repositories and run those scripts in the order you need. So, include in DAB everything it supports that meets your security (or other) requirements, and keep in the other repo the privileged scripts that need to be applied along with DAB. Take into account that, in the end, DAB is only a subset of the commands performed via the CLI, so you can manage both as tasks in CI/CD.
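As a rough illustration of running both repositories in a fixed order from CI/CD, here is a minimal Python sketch. It assumes the `terraform` and `databricks` CLIs are installed and already authenticated on the CI runner (for example with a service principal). The directory names `platform-infra` and `data-project` and the target name `prod` are hypothetical placeholders, not anything from this thread.

```python
import subprocess
from typing import List, Tuple

# A step is (working directory, command). Directory names are hypothetical.
Step = Tuple[str, List[str]]


def build_deploy_steps(terraform_dir: str, bundle_dir: str, target: str) -> List[Step]:
    """Return the commands in run order: privileged Terraform first, then the bundle."""
    return [
        # Privileged resources (e.g. grants) live in the restricted repo.
        (terraform_dir, ["terraform", "init", "-input=false"]),
        (terraform_dir, ["terraform", "apply", "-input=false", "-auto-approve"]),
        # Developer-owned resources are deployed from the bundle repo.
        (bundle_dir, ["databricks", "bundle", "validate", "-t", target]),
        (bundle_dir, ["databricks", "bundle", "deploy", "-t", target]),
    ]


def run_steps(steps: List[Step], dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for cwd, cmd in steps:
        print(f"[{cwd}] $ {' '.join(cmd)}")
        if not dry_run:
            # check=True stops the pipeline at the first failing command.
            subprocess.run(cmd, cwd=cwd, check=True)


if __name__ == "__main__":
    run_steps(build_deploy_steps("platform-infra", "data-project", "prod"), dry_run=True)
```

With `dry_run=True` the script only prints the plan, which is handy for checking the order before wiring it into a real pipeline. Most CI systems can express the same sequence directly as ordered tasks, so treat this as one possible shape rather than a required tool.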
8 hours ago
That is a very clear explanation; I understand it well now. Bundles seem better for simplicity and a repo-oriented workflow, making it easy to manage multiple repos where everyone creates their own Databricks resources. However, when I need stricter security, it makes sense to use a separate repository to deploy those sensitive resources. In that case, I think Terraform works very well, especially if the foundational infrastructure (like networks, buckets, etc.) was already created with it.
I appreciate your time.