
Delta Live Tables: dynamic schema

cpayne_vax
New Contributor III

Does anyone know if there's a way to specify an alternate Unity Catalog schema in a DLT workflow using the @dlt.table syntax? In my case, I'm looping through folders in Azure Data Lake Storage to ingest data. I'd like those folders to be created as tables in different schemas, within a single catalog or across multiple catalogs.

For example:

  • Catalog1 : schemaA : tableA
  • Catalog1 : schemaB : tableB
  • Catalog2 : schemaC : tableC

I tried specifying the three-part (catalog.schema.table) name as the table name in my @dlt.table decorator and received the error "INVALID_PARAMETER_VALUE.INVALID_FIELD CreateStagingTable name is not a valid name." Specifying the "path" attribute also returns an error saying that isn't supported with Unity Catalog. When creating the workflow I'm allowed to leave the schema blank, but I can't figure out how to specify it in my code.

I've been Googling around for a couple of days and couldn't find an answer. I'd love a solution that doesn't involve creating a separate workflow for each schema.
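For reference, here's a simplified sketch of the looping pattern I'm using (the storage paths, folder names, and table names are made up for illustration); putting a catalog or schema prefix in the table name is the part that errors out:

```python
import dlt

# Made-up ADLS folders being looped over; each one should become its own table
source_folders = {
    "tableA": "abfss://container@account.dfs.core.windows.net/landing/folderA",
    "tableB": "abfss://container@account.dfs.core.windows.net/landing/folderB",
}

def make_table(table_name, folder_path):
    # Passing a three-part name here (e.g. "catalog1.schemaA.tableA") is what
    # triggers the INVALID_PARAMETER_VALUE.INVALID_FIELD error for me.
    @dlt.table(name=table_name)
    def _ingest():
        # `spark` is provided by the DLT runtime
        return (
            spark.readStream.format("cloudFiles")
            .option("cloudFiles.format", "parquet")
            .load(folder_path)
        )

for name, path in source_folders.items():
    make_table(name, path)
```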


5 REPLIES

Kaniz
Community Manager

Hi @cpayne_vax, according to the Databricks documentation, you can use Unity Catalog with your Delta Live Tables (DLT) pipelines to define a catalog and schema where your pipeline will persist tables. You can also read data from Unity Catalog tables and share materialized views (live tables) with other users. To use Unity Catalog with your DLT pipelines, you need the appropriate privileges on the target catalog and schema.

 

When creating a DLT pipeline in the UI, select "Unity Catalog" in the Destination options. You will be prompted to choose your target catalog and schema, which is where all your live tables will be created. You can also specify the catalog and schema in your code using the @dlt.table syntax.

 

Note that a single pipeline cannot write to both the Hive metastore and Unity Catalog, and existing pipelines that use the Hive metastore cannot be upgraded to use Unity Catalog.

 

I hope this helps you with your DLT workflow. For more information, you can check out the Databricks blog post that explains how to build governed pipelines with DLT and Unity Catalog, or the Microsoft Learn module that provides best practices for using Unity Catalog. 
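For example, with a pipeline whose destination is set to Catalog1 and schemaA, a table defined with a single-part name, as in the sketch below (the table name and source are illustrative), is created as Catalog1.schemaA.customers_raw:

```python
import dlt

@dlt.table(
    name="customers_raw",        # single-part name: the pipeline's target
    comment="Raw customer data"  # catalog and schema decide where it lands
)
def customers_raw():
    # `spark` is provided by the DLT runtime; the source table is illustrative
    return spark.read.table("samples.nyctaxi.trips")
```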

cpayne_vax
New Contributor III

Thank you for the response, @Kaniz. It was close, but not exactly what I needed. I already knew most of what you mentioned, except for this one sentence:


> You can also specify the catalog and schema in your code using the @dlt.table syntax.

This is the part I can't figure out. How do you do this using the @dlt syntax? I can't find any documentation on it, and when I try using three-part names I receive errors. The "path" attribute doesn't work for this either, and the DLT Python reference doesn't mention it. Can you explain how to do this?

cpayne_vax
New Contributor III

Hello, just checking in here. I'd still love a solution. I tried using a "catalog" parameter in my @dlt.table decorator, but of course that didn't work.

cpayne_vax
New Contributor III
(Accepted solution)

For anyone following along, I heard back from our Databricks rep and the training team that this is, in fact, not possible today as Kaniz suggests. 

> When configuring your Delta Live Tables pipeline with UC, you are prompted to select a UC catalog and schema. All tables created in the DLT pipeline will be created under that UC catalog and schema - you cannot create tables outside of this catalog / schema within the same DLT pipeline.

They recommend creating a separate pipeline for each Unity Catalog schema you need to update, and said a fix may land sometime in Q1 2024. In the meantime, I'll move forward the long way.
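Here's roughly the pattern I'm planning: one shared notebook deployed as a separate pipeline per target schema, with the folder to ingest passed in through each pipeline's configuration (the configuration key and paths below are made up):

```python
import dlt

# Each pipeline that reuses this notebook selects its own target catalog and
# schema in its settings, and supplies the folder to ingest as a configuration
# value under a made-up key "source.folder" (e.g. an abfss:// path).
source_folder = spark.conf.get("source.folder")

@dlt.table(name="raw_data")
def raw_data():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "parquet")
        .load(source_folder)
    )
```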

data-engineer-d
New Contributor III

@cpayne_vax now that we're at the end of Q1 2024, do we have the ability to write to any schema dynamically?