Error: cannot create metastore data access: No metastore assigned for the current workspace.

Pat
Honored Contributor III

🐔 and 🐣 situation?

I am currently trying to come up with a way to deploy Databricks with Terraform in a multi-region, multi-tenant environment. I am not talking about simple cases like this one (https://docs.databricks.com/data-governance/unity-catalog/automate.html).

Ideally I would like to have at least a separate DEV UC and PROD UC, each with multiple workspaces.

I've created some modules for re-usable resources:

metastore(uc)

s3

vpc

workspace

and I've planned my deployment this way:

datalake/dev_1

datalake/dev_2

datalake/prod_1

datalake/prod_2

datalake/global

dev_* and prod_* - different workspaces

global - metastore

The idea was to create the Unity Catalog (metastore) first, then attach each datalake/env_* workspace to it. It looks like we can create the metastore without a workspace, but there is no way to add a metastore_data_access to it until a workspace has been assigned.

--

Link to the example usage (https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/metastore_data_access#example-usage)

It seems like it should work.

## Create UC metastore
resource "databricks_metastore" "this" {
  provider      = databricks.workspace
  name          = "${local.prefix}-${var.workspace}-metastore-${var.region}-${var.env}"
  storage_root  = "s3://${var.aws_s3_bucket}/metastore"
  owner         = var.owner
  force_destroy = true
}
 
resource "databricks_metastore_data_access" "this" {
  provider     = databricks.workspace
  metastore_id = databricks_metastore.this.id
  name         = aws_iam_role.metastore_data_access.name
  aws_iam_role {
    role_arn = aws_iam_role.metastore_data_access.arn
  }
  is_default = true
}

In the Unity Catalog automation guide (https://docs.databricks.com/data-governance/unity-catalog/automate.html#configure-a-metastore) you can see the workspace being assigned to the metastore (or vice versa).

resource "databricks_metastore_assignment" "default_metastore" {
  depends_on           = [ databricks_metastore_data_access.metastore_data_access ]
  workspace_id         = var.default_metastore_workspace_id
  metastore_id         = databricks_metastore.metastore.id
  default_catalog_name = var.default_metastore_default_catalog_name
}

The assignment is actually done first, so I guess the steps in the guide should be reordered to make that visible:

variable "metastore_name" {}
variable "metastore_label" {}
variable "default_metastore_workspace_id" {}
variable "default_metastore_default_catalog_name" {}
 
resource "databricks_metastore" "metastore" {
  name          = var.metastore_name
  storage_root  = "s3://${aws_s3_bucket.metastore.id}/${var.metastore_label}"
  force_destroy = true
}
 
resource "databricks_metastore_assignment" "default_metastore" {
  depends_on           = [ databricks_metastore_data_access.metastore_data_access ]
  workspace_id         = var.default_metastore_workspace_id
  metastore_id         = databricks_metastore.metastore.id
  default_catalog_name = var.default_metastore_default_catalog_name
}
 
resource "databricks_metastore_data_access" "metastore_data_access" {
  depends_on   = [ databricks_metastore.metastore ]
  metastore_id = databricks_metastore.metastore.id
  name         = aws_iam_role.metastore_data_access.name
  aws_iam_role { role_arn = aws_iam_role.metastore_data_access.arn }
  is_default   = true
}
 
 

Let's get back to the question. How are you planning your Terraform deployment with UC? It would be great to learn how others are dealing with this.

I have two options in mind:

  • have an admin workspace to deploy UC, but I guess I would need one per UC (so this is not going to work)
  • first deploy one workspace together with Unity Catalog, then add the other workspaces. I would need to create a default workspace, for example:
    • a dev_datalake workspace with Unity Catalog; the other workspaces (dev_analytics, dev_sandbox, etc.) would then be deployed a different way.

This means that my `global` idea won't work.

Looking forward to seeing some ideas.

4 REPLIES 4

Andrei_Radulesc
Contributor III

I got the same error

Error: cannot create metastore data access: No metastore assigned for the current workspace.

and fixed it by reversing the depends_on order. I am now making databricks_metastore_data_access depend on databricks_metastore_assignment. Here is my code:

resource "databricks_metastore" "this" {
  provider     = databricks.workspace
  name         = "${var.prefix}-metastore"
  storage_root = "s3://${aws_s3_bucket.metastore.id}/metastore"
  delta_sharing_scope = "INTERNAL"
  delta_sharing_recipient_token_lifetime_in_seconds = 120
  force_destroy = true
}

// Assign the metastore to workspaces
resource "databricks_metastore_assignment" "this" {
  provider     = databricks.workspace
  count        = length(var.workspaces)
  metastore_id = databricks_metastore.this.id
  workspace_id = tonumber(replace(var.workspaces[count.index], "/.*//", ""))
  depends_on   = [ databricks_metastore.this ]
}

resource "databricks_metastore_data_access" "metastore_data_access" {
  provider     = databricks.workspace
  metastore_id = databricks_metastore.this.id
  name         = aws_iam_role.metastore_data_access.name
  aws_iam_role { role_arn = aws_iam_role.metastore_data_access.arn }
  is_default   = true
  depends_on   = [ databricks_metastore_assignment.this ]
}
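
The replace() call in the assignment above is easy to misread. When Terraform's replace function is given a pattern wrapped in forward slashes, it treats the pattern as a regular expression, so "/.*//" is the regex .*/ and removes everything up to and including the last slash. Here is a minimal sketch, assuming each entry of var.workspaces looks like "<account_id>/<workspace_id>"; the thread does not show the real values, so the ID and the local names below are made up for illustration:

```hcl
locals {
  # Hypothetical value; the real contents of var.workspaces are not shown in the thread.
  example_workspace = "00000000-0000-0000-0000-000000000000/1234567890123456"

  # "/.*//" is the regex `.*/` between slash delimiters:
  # strip everything through the last "/", then convert the rest to a number.
  example_workspace_id = tonumber(replace(local.example_workspace, "/.*//", ""))
}

output "example_workspace_id" {
  # The numeric part after the slash.
  value = local.example_workspace_id
}
```

If the entries are already plain numeric workspace IDs, the replace() is a no-op and only tonumber() matters.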

Pat
Honored Contributor III

Yes, I guess this would help.

My problem is that ideally I would like to avoid assigning the metastore to a workspace before creating databricks_metastore_data_access.

My initial plan was to put the Unity Catalog deployment in a separate folder in the Terraform structure, and each workspace in its own folder as well:

dev/

dev/uc

dev/ws_1

dev/ws_2

but I guess that I will need to re-think this. The order of the steps confused me a bit 🙂
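
One way the folder layout above could still work is to let the uc stack publish the metastore ID and have each workspace stack do only its own assignment. This is a rough sketch rather than a tested setup: it assumes the uc stack has already assigned the metastore to at least one workspace so that it can create the data access (as in the reply above), and the file names, the local backend, the state path, and var.workspace_id are all hypothetical.

```hcl
# dev/uc/outputs.tf (hypothetical file): expose the metastore ID to other stacks.
output "metastore_id" {
  value = databricks_metastore.this.id
}

# dev/ws_1/main.tf (hypothetical file): read the ID from the uc stack's state.
data "terraform_remote_state" "uc" {
  backend = "local" # assumption; use whichever backend the uc stack really uses
  config = {
    path = "../uc/terraform.tfstate" # assumed location of the uc state file
  }
}

# Attach this workspace to the metastore created by the uc stack.
resource "databricks_metastore_assignment" "this" {
  provider     = databricks.workspace
  workspace_id = var.workspace_id # hypothetical variable holding the numeric workspace ID
  metastore_id = data.terraform_remote_state.uc.outputs.metastore_id
}
```

With this split, only the uc stack has to respect the assignment-before-data-access order; the workspace stacks never touch databricks_metastore_data_access.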

Anonymous
Not applicable

Hi @Pat Sienkiewicz​ 

Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. 

We'd love to hear from you.

Thanks!

pwc-aiq
New Contributor II

Hi, I'm experiencing a similar issue. I filed an Issue on Github here
