
Databricks Terraform - how to manage Databricks entirely through Terraform?

Jfoxyyc
Valued Contributor

I'm stuck: I can't fully automate the setup of a Databricks environment because service principals can't be made admins at the account level (accounts.azuredatabricks.net, and similarly on AWS). Starting from a bare tenant with no prior Databricks deployment, I need to:

  • Set up Terraform and launch a module to create a resource group, a workspace, and a couple of storage accounts
  • Manually log into the accounts portal as a global admin of the Azure tenant
  • Set up an admin user group and grant it the account admin role
  • Manually authenticate as my non-service-principal Terraform user to set up the Unity Catalog metastore and attach the workspace
  • Continue with the Terraform workflow
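
For reference, the first step in the list above might look roughly like the following. This is only a sketch: every name, the region, and the storage settings are placeholders made up for illustration, and the `premium` SKU is an assumption based on Unity Catalog requiring a premium workspace.

```hcl
# Sketch only: all names and the region are hypothetical placeholders.
provider "azurerm" {
  features {}
  # Newer azurerm releases also expect subscription_id here or via ARM_SUBSCRIPTION_ID.
}

resource "azurerm_resource_group" "this" {
  name     = "rg-databricks-example"
  location = "westeurope"
}

resource "azurerm_databricks_workspace" "this" {
  name                = "dbw-example"
  resource_group_name = azurerm_resource_group.this.name
  location            = azurerm_resource_group.this.location
  sku                 = "premium" # assumed: needed for Unity Catalog
}

# One of the "couple of storage accounts"; hierarchical namespace makes it ADLS Gen2.
resource "azurerm_storage_account" "lake" {
  name                     = "exampledbxlake" # must be globally unique
  resource_group_name      = azurerm_resource_group.this.name
  location                 = azurerm_resource_group.this.location
  account_tier             = "Standard"
  account_replication_type = "LRS"
  is_hns_enabled           = true
}
```

This part runs fine under a service principal; the manual steps that follow are where the automation breaks down.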

Not being able to make a service principal an account admin, or to give a service principal the ability to create metastores and perform other account admin functions, really disrupts the IaC workflow.

Thoughts?

2 REPLIES

daniel_sahal
Esteemed Contributor

Unfortunately, there are still some limitations with doing IaC on Databricks with Terraform (e.g., another one is that you can't set up Key Vault as a secret store with a service principal).

I think that, instead of doing these steps manually, you can authenticate through azure-cli and run the Terraform scripts in your own account context.
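
A rough sketch of what that could look like, assuming you have already run `az login` as a user who is an account admin. The variable names, metastore name, region, and storage path are hypothetical, and a real metastore would likely also need a data access configuration that is not shown here.

```hcl
terraform {
  required_providers {
    databricks = {
      source = "databricks/databricks"
    }
  }
}

# Hypothetical inputs, supplied by whoever runs this locally.
variable "databricks_account_id" {
  type = string
}

variable "workspace_id" {
  type = number
}

# Account-level provider that reuses the Azure CLI login of the current user.
provider "databricks" {
  alias      = "account"
  host       = "https://accounts.azuredatabricks.net"
  account_id = var.databricks_account_id
  auth_type  = "azure-cli"
}

resource "databricks_metastore" "this" {
  provider     = databricks.account
  name         = "primary"    # placeholder
  region       = "westeurope" # placeholder
  storage_root = "abfss://metastore@exampledbxlake.dfs.core.windows.net/" # placeholder
}

# Attach the existing workspace to the new metastore.
resource "databricks_metastore_assignment" "this" {
  provider     = databricks.account
  metastore_id = databricks_metastore.this.id
  workspace_id = var.workspace_id
}
```

The trade-off is that this only works from a session where a human has logged in, so it replaces the portal clicks but not the manual step itself.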

It's not perfect, though...

Right. The idea would be to never run Terraform outside of a CI/CD pipeline, in which case the pipeline would authenticate using a service principal, not azure-cli. How Terraform manages state makes this even more awkward: it would be difficult to set everything up using a non-CI/CD module and a backend, and then use that same backend with a different module in the CI/CD pipeline.
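
One possible way around the shared-state problem, sketched under assumptions: keep the manual bootstrap and the pipeline in separate state files in the same storage container, and have the pipeline module read the bootstrap outputs instead of sharing one state. All names below are hypothetical, and this assumes the bootstrap module declares an output called `metastore_id`.

```hcl
# Pipeline-side root module (hypothetical layout and names).
terraform {
  backend "azurerm" {
    resource_group_name  = "rg-tfstate"
    storage_account_name = "exampletfstate"
    container_name       = "tfstate"
    key                  = "databricks-pipeline.tfstate" # the pipeline's own state
  }
}

# Read-only view of the state written by the manually run bootstrap module.
data "terraform_remote_state" "bootstrap" {
  backend = "azurerm"
  config = {
    resource_group_name  = "rg-tfstate"
    storage_account_name = "exampletfstate"
    container_name       = "tfstate"
    key                  = "databricks-bootstrap.tfstate"
  }
}

locals {
  # Assumes the bootstrap module exposes: output "metastore_id" { ... }
  metastore_id = data.terraform_remote_state.bootstrap.outputs.metastore_id
}
```

This does not remove the manual bootstrap run; it only keeps the pipeline from having to manage resources it cannot authenticate for.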

My goal is to be able to create a resource group, create some service principals, make them owners of the resource group, and then have them set up everything Databricks needs through Terraform. This doesn't seem to be possible right now. I'd love to work with the Databricks Terraform provider group to address these pain points.
