
How to create Unity Catalog tables/views via Terraform?

DBedrenko
New Contributor III

Hello, there are a bunch of pages in the documentation that mention tables and views can be created via Terraform:

> You can also create a managed table by using the Databricks Terraform provider and databricks_table

But those links to `databricks_table` and `databricks_view` that lead to the TF provider documentation seem to say that these resources are no longer provided:

> Page Not Found: This documentation page doesn't exist for version 1.14.3 of the databricks provider.

Are the Databricks docs out of date? Then is there another way to create Unity Catalog tables or views via TF?

My concrete problem is that I have some configuration data in Terraform files that needs to be used to create a specific view. The TF configuration is obviously not accessible from the Databricks workspace, so I can't create the view from the workspace; hence I'm trying to create the view via TF, where the configuration data is readily available.

I already create the catalogs, schemas, and everything else needed to set up Databricks via TF.
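For context, the catalog/schema part of my config looks roughly like this (a simplified sketch with placeholder names, not my exact configuration):

# Simplified sketch: catalog and schema names are placeholders
resource "databricks_catalog" "main" {
  name    = "main"
  comment = "Managed by Terraform"
}

resource "databricks_schema" "analytics" {
  catalog_name = databricks_catalog.main.name
  name         = "analytics"
  comment      = "Managed by Terraform"
}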

If tables and views cannot be created via TF, is there some other way that data defined in TF files can be exposed to the Databricks workspace? Then I could create my views in the workspace environment (via Python or SQL).


4 REPLIES

Anonymous
Not applicable

@Daniel Bedrenko:

You can create Unity Catalog tables and views via Terraform using the databricks_sql_script resource. With this resource, you can define the SQL script that creates the table or view and run it via Terraform. Here is an example of how you can create a view using this resource:

resource "databricks_sql_script" "example_view" {
  database = "example_db"
  content = "CREATE VIEW example_view AS SELECT * FROM example_table WHERE id > 10;"
}

In this example, we are creating a view called example_view in the example_db database, and the view selects all rows from example_table where the id column is greater than 10.

Alternatively, if you have data defined in TF files that needs to be exposed to the Databricks workspace, you can use the databricks_notebook resource to create a Python or SQL notebook that includes the data and creates the view in the workspace. Here is an example of how you can create a Python notebook that creates a view:

resource "databricks_notebook" "example_notebook" {
  path     = "/Shared/example_notebook"
  language = "PYTHON"
  content_base64 = base64encode(templatefile("${path.module}/example_notebook.py.tpl", {
    example_data = jsonencode(var.example_data)
  }))
}

variable "example_data" {
  type = any
}

And the example_notebook.py.tpl template file:

from pyspark.sql.types import StructType, StructField, IntegerType, StringType

# Rendered by Terraform's templatefile(): example_data is inserted as a JSON literal
example_data = ${example_data}

schema = StructType([
    StructField("id", IntegerType(), True),
    StructField("name", StringType(), True),
    StructField("age", IntegerType(), True)
])

df = spark.createDataFrame(example_data, schema)
df.createOrReplaceTempView("example_view")
 

In this example, we are creating a Python notebook called example_notebook that takes in a variable called example_data, which contains the data that needs to be exposed to the workspace. The content_base64 field contains the base64-encoded contents of the notebook, which are generated from a template file. The example_notebook.py.tpl template file contains the Python code that creates the view using the data from the example_data variable.

I hope this helps, and let me know if you have any further questions!

DBedrenko
New Contributor III

Thank you very much for the thorough answer, it is so helpful!

One question: you mention a `databricks_sql_script` resource, but I can find no such resource in the provider docs. I can see `databricks_sql_query`, but it doesn't have the `database` and `content` fields from your example, so I'm not sure if this is the one you meant.
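For reference, `databricks_sql_query` in the provider docs looks roughly like this (my reading of the docs, untested; the SQL endpoint reference is a placeholder):

# Sketch based on the provider docs; not tested
resource "databricks_sql_query" "example" {
  data_source_id = databricks_sql_endpoint.example.data_source_id  # placeholder endpoint
  name           = "Example query"
  query          = "SELECT * FROM example_table WHERE id > 10"
}

So it seems to be meant for Databricks SQL queries rather than for running arbitrary DDL.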

DBedrenko
New Contributor III

Hmm, it seems that Databricks developers say that creating tables/views in Unity Catalog from Terraform is discouraged:

> so there are quite a few gaps/edge cases with the tables API, hence customers should not use the API or Terraform to create/manage Unity Catalog tables & views at the moment.

So then I think the best way to create tables/views is via a Job, until such time as Databricks offers a stable API to do this via Terraform.

For posterity, there's also another way to communicate data from TF to Databricks: via K8s secrets.

Anonymous
Not applicable

@Daniel Bedrenko:

Yes, you are correct. The Databricks developers discourage using the API or Terraform to create and manage Unity Catalog tables and views due to gaps and edge cases with the tables API. Instead, they recommend using Jobs to create and manage tables and views.

Using Jobs is a good alternative to using Terraform to create tables and views. You can create a Job with a script that creates tables or views, and then use Terraform to manage the Job. This allows you to create and manage tables and views from Terraform while avoiding the gaps and edge cases of the tables API.
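A minimal sketch of that pattern might look like the following (resource and field names should be checked against your provider version; the cluster reference and the notebook from the earlier example are placeholders):

# Sketch only: assumes the example_notebook resource from above and an existing cluster
resource "databricks_job" "create_views" {
  name = "create-unity-catalog-views"

  task {
    task_key            = "create_views"
    existing_cluster_id = databricks_cluster.shared.id  # placeholder cluster
    notebook_task {
      notebook_path = databricks_notebook.example_notebook.path
    }
  }
}

You would then run the job (manually, on a schedule, or from a pipeline) so the notebook executes the CREATE TABLE / CREATE VIEW statements inside the workspace.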

Regarding the communication of data from Terraform to Databricks, using Kubernetes secrets is a good option. You can create a Kubernetes secret with the configuration data that needs to be used in the Databricks workspace, and then pass the secret to your Databricks cluster as an environment variable or a mounted volume.
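For example, the secret itself could be defined in the same Terraform code with the hashicorp/kubernetes provider (a sketch; the namespace and key names are placeholders, and how you mount it depends on your cluster setup):

# Sketch only: assumes the kubernetes provider is configured; names are placeholders
resource "kubernetes_secret" "databricks_view_config" {
  metadata {
    name      = "databricks-view-config"
    namespace = "databricks"
  }

  data = {
    view_config = jsonencode(var.example_data)
  }
}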

Overall, Jobs and Kubernetes secrets are good ways to create and manage Unity Catalog tables and views from Terraform and to pass data from Terraform to Databricks.
