11-05-2022 10:19 AM
I have been following the Terraform Databricks provider documentation to provision account-level resources on AWS. I can create the workspace fine, add users, etc. However, when I go to use the provider in non-MWS mode, I am receiving errors saying:
│ Error: workspace is most likely not created yet, because the `host` is empty. Please add `depends_on = [databricks_mws_workspaces.this]` or `depends_on = [azurerm_databricks_workspace.this]` to every data resource. See https://www.terraform.io/docs/language/resources/behavior.html more info. Please check https://registry.terraform.io/providers/databricks/databricks/latest/docs#authentication for details
│
│   with module.workspaces.data.databricks_spark_version.latest,
│   on ../modules/aws_workspaces/init.tf line 11, in data "databricks_spark_version" "latest":
│   11: data "databricks_spark_version" "latest" {}
To show how this was created: I have a file called root.tf which creates the root MWS-level resources absolutely fine.
provider "databricks" {
  alias    = "mws"
  host     = "https://accounts.cloud.databricks.com"
  username = var.databricks_account_username
  password = var.databricks_account_password
}

module "root" {
  source                      = "../modules/aws_root"
  databricks_account_id       = var.databricks_account_id
  tags                        = var.tags
  region                      = var.region
  cidr_block                  = var.cidr_block
  databricks_users            = var.databricks_users
  databricks_metastore_admins = var.databricks_metastore_admins
  unity_admin_group           = var.unity_admin_group

  providers = {
    databricks.mws = databricks.mws
  }
}
With outputs coming from that module:
output "databricks_host" {
  value = databricks_mws_workspaces.this.workspace_url
}

output "databricks_token" {
  value     = databricks_mws_workspaces.this.token[0].token_value
  sensitive = true
}

output "databricks_workspace_id" {
  value     = databricks_mws_workspaces.this.workspace_id
  sensitive = false
}

output "databricks_account_id" {
  value     = databricks_mws_workspaces.this.account_id
  sensitive = true
}

output "aws_iam_role_metastore_data_access_arn" {
  value = aws_iam_role.metastore_data_access.arn
}

output "aws_iam_role_metastore_data_access_name" {
  value = aws_iam_role.metastore_data_access.name
}

output "aws_s3_bucket_metastore_id" {
  value = aws_s3_bucket.metastore.id
}
These can be seen in the created resources when I do a
`terraform state show <outputs>`
However, when I go to create a workspace-level provider to create some notebooks, clusters, etc., I seem to be unable to get the child module resources to use the newly created provider with a host, even though it is being set and I can see its value. Even hard-coding the host does not work. They all output the above error.
The creation of this provider and module can be seen here:
provider "databricks" {
  alias      = "workspace"
  host       = module.root.databricks_host
  token      = module.root.databricks_token
  account_id = module.root.databricks_account_id
}

module "workspaces" {
  source                                  = "../modules/aws_workspaces"
  aws_s3_bucket_metastore_id              = module.root.aws_s3_bucket_metastore_id
  aws_iam_role_metastore_data_access_arn  = module.root.aws_iam_role_metastore_data_access_arn
  aws_iam_role_metastore_data_access_name = module.root.aws_iam_role_metastore_data_access_name
  cidr_block                              = var.cidr_block
  databricks_account_id                   = var.databricks_account_id
  databricks_bronze_users                 = var.databricks_bronze_users
  databricks_gold_users                   = var.databricks_gold_users
  databricks_host                         = module.root.databricks_host
  databricks_metastore_admins             = var.databricks_metastore_admins
  databricks_silver_users                 = var.databricks_silver_users
  databricks_token                        = module.root.databricks_token
  databricks_users                        = var.databricks_users
  databricks_workspace_id                 = module.root.databricks_workspace_id
  python_module_version_number            = local.python_module_version_number
  shed_databricks_egg_name                = var.shed_databricks_egg_name
  tags                                    = var.tags
  unity_admin_group                       = var.unity_admin_group
  config                                  = local.config

  depends_on = [
    module.root
  ]

  providers = {
    databricks = databricks.workspace
  }
}
And the init.tf file that uses this provider and is throwing the error can be seen here:
terraform {
  required_providers {
    databricks = {
      source                = "databricks/databricks"
      version               = "~> 1.6.2"
      configuration_aliases = [databricks.workspace]
    }
  }
}

data "databricks_spark_version" "latest" {}

data "databricks_node_type" "smallest" {
  local_disk = true
}
The suggestion to add a depends_on for `databricks_mws_workspaces.this` isn't possible, as that resource is created in the root module, where the `databricks.mws` provider is used. (The documentation says each module should isolate providers.)
11-06-2022 09:11 AM
So the answer to this was that you need to explicitly pass the `provider` argument to each of the data resource blocks. The docs should be updated to reflect that.
i.e.
data "databricks_spark_version" "latest" {
  provider = databricks.workspace
}

data "databricks_node_type" "smallest" {
  provider   = databricks.workspace
  local_disk = true
}
11-05-2022 10:26 AM
I'm also curious about this, as I haven't been able to successfully create an output for the Databricks account_id.
11-06-2022 01:41 AM
That's odd; it makes me wonder whether any attributes of that resource are getting exported. One could try writing the values out during an apply, using a null resource that just echoes the values into a local file.
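A minimal sketch of that null-resource trick, assuming the module outputs described above (the file name and trigger key are placeholders):

```
# Hypothetical debugging aid: re-runs whenever the value changes and
# writes it to a local file so you can inspect what was exported.
resource "null_resource" "debug_outputs" {
  triggers = {
    account_id = module.root.databricks_account_id
  }

  provisioner "local-exec" {
    command = "echo '${module.root.databricks_account_id}' > debug_account_id.txt"
  }
}
```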
11-07-2022 02:37 PM
Were you able to output the Databricks account_id?
11-07-2022 02:41 PM
I am going to be honest: I don't recall off the top of my head, but it is getting passed in as an argument to the other modules above, so I assume so. I was able to verify that the other two were getting exported by adding a null resource that was something like:
resource "null_resource" "echo" {
  provisioner "local-exec" {
    command = "echo '${module.root_mws.account_id}' > output.txt"
  }
}