
Delta Live Tables: dynamic schema

cpayne_vax
New Contributor III

Does anyone know if there's a way to specify an alternate Unity Catalog schema in a DLT workflow using the @dlt.table syntax? In my case, I'm looping through folders in Azure Data Lake Storage to ingest data. I'd like the tables for those folders to be created in different schemas within a single Unity Catalog catalog (or across multiple catalogs).

For example:

  • Catalog1 : schemaA : tableA
  • Catalog1 : schemaB : tableB
  • Catalog2 : schemaC : tableC

I tried specifying a three-part name (catalog.schema.table) as the table name in my @dlt.table syntax and received the error "INVALID_PARAMETER_VALUE.INVALID_FIELD CreateStagingTable name is not a valid name." Specifying the "path" attribute also returns an error saying that I can't do that with Unity Catalog. When creating the workflow I'm allowed to leave the schema blank, but I can't figure out how to specify it in my code.

I've been googling around for a couple of days and couldn't find an answer. I would love it if you could suggest a solution that doesn't involve creating separate workflows for each schema.

1 ACCEPTED SOLUTION

cpayne_vax
New Contributor III

For anyone following along, I heard back from our Databricks rep and the training team that this is, in fact, not possible today as Kaniz suggests. 

> When configuring your Delta Live Tables pipeline with UC, you are prompted to select a UC catalog and schema. All tables created in the DLT pipeline will be created under that UC catalog and schema - you cannot create tables outside of this catalog / schema within the same DLT pipeline.

They recommend creating a separate pipeline for each Unity schema you need to update, and that there may be a potential fix to this sometime in Q1 2024. In the meantime, I'll move forward the long way. 
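
Since the recommendation is one pipeline per schema, the bookkeeping can at least be generated from a single mapping. Below is a minimal sketch in plain Python, not an official Databricks method: it groups three-part table names by catalog and schema and produces one settings dictionary per group. The folder names, the `plan_pipelines` helper, and the `ingest.tables` configuration key are hypothetical; the `catalog` and `target` keys are assumed to mirror the fields of the same name in DLT pipeline settings, so check them against your workspace before relying on them.

```python
from collections import defaultdict

# Hypothetical inventory: source folder -> fully qualified table name.
FOLDER_TARGETS = {
    "folderA": "Catalog1.schemaA.tableA",
    "folderB": "Catalog1.schemaB.tableB",
    "folderC": "Catalog2.schemaC.tableC",
}


def plan_pipelines(folder_targets):
    """Group tables by (catalog, schema) and return one settings dict per group."""
    groups = defaultdict(dict)
    for folder, full_name in folder_targets.items():
        parts = full_name.split(".")
        if len(parts) != 3:
            raise ValueError(f"expected catalog.schema.table, got {full_name!r}")
        catalog, schema, table = parts
        groups[(catalog, schema)][table] = folder

    plans = []
    for (catalog, schema), tables in sorted(groups.items()):
        plans.append({
            # Hypothetical naming convention for the generated pipelines.
            "name": f"ingest_{catalog}_{schema}",
            # Assumed to match the pipeline settings fields of the same name.
            "catalog": catalog,
            "target": schema,
            # Hypothetical key the pipeline notebook could read to know
            # which tables to create and from which folders.
            "configuration": {
                "ingest.tables": ",".join(
                    f"{table}={folder}" for table, folder in sorted(tables.items())
                )
            },
        })
    return plans


pipeline_plans = plan_pipelines(FOLDER_TARGETS)
for plan in pipeline_plans:
    print(plan["name"], plan["catalog"], plan["target"])
```

Each dictionary could then be passed to whatever you use to create pipelines (UI, API, or infrastructure tooling), with the same notebook reused across all of them.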

8 REPLIES

Thank you for the response, @Retired_mod. It was close, but not exactly what I needed. I already knew most of what you mentioned, except this one sentence:

> You can also specify the catalog and schema in your code using the @dlt.table syntax.

This is the part that I can't figure out. How do you do this using the @dlt syntax? I can't find any documentation on it, and when I try using a three-part name to specify this, I receive errors. The "path" attribute also doesn't work for this, and the DLT Python reference doesn't mention it. Can you explain how to do this?

cpayne_vax
New Contributor III

Hello, just checking in here. I would love to hear if there is a solution. I tried using a "catalog" parameter in my @dlt.table syntax, but of course that didn't work.

data-engineer-d
Contributor

@cpayne_vax, now that we are at the end of Q1 2024, do we have the ability to write to any schema dynamically?

mhscience525
New Contributor III

I haven't found a way to do that so far, so I have to create separate pipelines.

Does anyone have an update on the possibility of writing to different schemas using the same pipeline?

manish1987c
New Contributor III

Please let me know if there is a way to define the schema dynamically within a single Delta Live Tables pipeline in the same catalog, as below.

For example:

  • Catalog1 : schemaA : tableA
  • Catalog1 : schemaB : tableB

Taja
New Contributor II

Databricks announced that this feature will be available in public preview in Q4 2024. With it, you will be able to publish tables to arbitrary catalogs and schemas from a single DLT pipeline.

surajitDE
New Contributor II

Q4 2024 is about to end, and I still don't see any updates on this feature.
