11-05-2022 10:19 AM
I have been following the Terraform Databricks provider documentation to provision account-level resources on AWS. I can create the workspace fine, add users, etc. However, when I go to use the provider in non-MWS (workspace-level) mode, I receive the following error:
Error: workspace is most likely not created yet, because the `host` is empty. Please add `depends_on = [databricks_mws_workspaces.this]` or `depends_on = [azurerm_databricks_workspace.this]` to every data resource. See https://www.terraform.io/docs/language/resources/behavior.html more info. Please check https://registry.terraform.io/providers/databricks/databricks/latest/docs#authentication for details
│
│   with module.workspaces.data.databricks_spark_version.latest,
│   on ../modules/aws_workspaces/init.tf line 11, in data "databricks_spark_version" "latest":
│   11: data "databricks_spark_version" "latest" {}
To show how this is created: I have a file called root.tf which creates the root MWS-level resources absolutely fine.
provider "databricks" {
alias = "mws"
host = "https://accounts.cloud.databricks.com"
username = var.databricks_account_username
password = var.databricks_account_password
}
module "root" {
source = "../modules/aws_root"
databricks_account_id = var.databricks_account_id
tags = var.tags
region = var.region
cidr_block = var.cidr_block
databricks_users = var.databricks_users
databricks_metastore_admins = var.databricks_metastore_admins
unity_admin_group = var.unity_admin_group
providers = {
databricks.mws = databricks.mws
}
}
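The aws_root module declares the aliased provider it expects via configuration_aliases, mirroring the pattern shown for the workspace module further down. A sketch of what that block looks like inside ../modules/aws_root (the exact block isn't reproduced in this post):
terraform {
  required_providers {
    databricks = {
      source                = "databricks/databricks"
      configuration_aliases = [databricks.mws]
    }
  }
}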
With outputs coming from that module:
output "databricks_host" {
value = databricks_mws_workspaces.this.workspace_url
}
output "databricks_token" {
value = databricks_mws_workspaces.this.token[0].token_value
sensitive = true
}
output "databricks_workspace_id" {
value = databricks_mws_workspaces.this.workspace_id
sensitive = false
}
output "databricks_account_id" {
value = databricks_mws_workspaces.this.account_id
sensitive = true
}
output "aws_iam_role_metastore_data_access_arn" {
value = aws_iam_role.metastore_data_access.arn
}
output "aws_iam_role_metastore_data_access_name" {
value = aws_iam_role.metastore_data_access.name
}
output "aws_s3_bucket_metastore_id" {
value = aws_s3_bucket.metastore.id
}
These values can all be seen on the created resources when I do a `terraform state show <resource>`.
However, when I go to create a workspace-level provider to create some notebooks, clusters, etc., I seem to be unable to get the child module resources to use the newly created provider with a host, even though it is being set and I can see its value. Even hard-coding the host does not work. They all output the above error.
The creation of this provider and module can be seen here:
provider "databricks" {
alias = "workspace"
host = module.root.databricks_host
token = module.root.databricks_token
account_id = module.root.databricks_account_id
}
module "workspaces" {
source = "../modules/aws_workspaces"
aws_s3_bucket_metastore_id = module.root.aws_s3_bucket_metastore_id
aws_iam_role_metastore_data_access_arn = module.root.aws_iam_role_metastore_data_access_arn
aws_iam_role_metastore_data_access_name = module.root.aws_iam_role_metastore_data_access_name
cidr_block = var.cidr_block
databricks_account_id = var.databricks_account_id
databricks_bronze_users = var.databricks_bronze_users
databricks_gold_users = var.databricks_gold_users
databricks_host = module.root.databricks_host
databricks_metastore_admins = var.databricks_metastore_admins
databricks_silver_users = var.databricks_silver_users
databricks_token = module.root.databricks_token
databricks_users = var.databricks_users
databricks_workspace_id = module.root.databricks_workspace_id
python_module_version_number = local.python_module_version_number
shed_databricks_egg_name = var.shed_databricks_egg_name
tags = var.tags
unity_admin_group = var.unity_admin_group
config = local.config
depends_on = [
module.root
]
providers = {
databricks = databricks.workspace
}
}
And the init file (`init.tf`) that uses this provider and throws the error can be seen here:
terraform {
  required_providers {
    databricks = {
      source                = "databricks/databricks"
      version               = "~> 1.6.2"
      configuration_aliases = [databricks.workspace]
    }
  }
}
data "databricks_spark_version" "latest" {}
data "databricks_node_type" "smallest" {
  local_disk = true
}
The suggestion to add a `depends_on` for `databricks_mws_workspaces.this` isn't possible, as that resource is created in the root module where the `databricks.mws` provider is used. (The documentation says each module should isolate providers.)
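For reference, what the error message asks for only works when the data source lives in the same module as the workspace resource. A rough sketch of that single-module layout (not my actual config):
data "databricks_spark_version" "latest" {
  # only possible where databricks_mws_workspaces.this is in scope
  depends_on = [databricks_mws_workspaces.this]
}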
11-06-2022 09:11 AM
So the answer to this was that you need to explicitly pass the provider argument to each of the data resource blocks. The docs should be updated to reflect that.
i.e.
data "databricks_spark_version" "latest" {
provider = databricks.workspace
}
data "databricks_node_type" "smallest" {
provider = databricks.workspace
local_disk = true
}
11-05-2022 10:26 AM
I'm also curious about this, as I haven't been able to successfully create an output for the Databricks account_id.
11-06-2022 01:41 AM
That's odd; it makes me wonder if any attributes of that resource are getting exported. I suppose one could try to write the values out during an apply using a null resource, just echoing the values into a local file.
11-07-2022 02:37 PM
Were you able to output the databricks account_id?
11-07-2022 02:41 PM
I am going to be honest: I don't recall off the top of my head, but it is getting passed in as an argument to the other modules above, so I assume so. I was able to verify that the other two were getting exported by adding a null resource that was something like:
resource "null_resource" "echo" {
local-exec = "echo '${module.root_mws.account_id}' > output.txt"
}
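One thing worth checking: the account_id output in the original post is marked sensitive = true, so `terraform output` masks it as "(sensitive value)". `terraform output -raw databricks_account_id` should print it directly, or, since an account ID is arguably not a secret, the flag could simply be dropped. A suggested tweak, not something from the thread:
output "databricks_account_id" {
  value = databricks_mws_workspaces.this.account_id
  # sensitive omitted (defaults to false) so `terraform output` prints the value
}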