Data Engineering

Delta Live Tables: dynamic schema

cpayne_vax
New Contributor III

Does anyone know if there's a way to specify an alternate Unity Catalog schema in a DLT workflow using the @dlt.table syntax? In my case, I’m looping through folders in Azure Data Lake Storage to ingest data. I’d like the tables for those folders to be created in different schemas within a single Unity Catalog catalog (or across multiple catalogs).

For example:

  • Catalog1 : schemaA : tableA
  • Catalog1 : schemaB : tableB
  • Catalog2 : schemaC : tableC

I tried using a three-part name (catalog.schema.table) as the table name in my @dlt.table decorator and received the error “INVALID_PARAMETER_VALUE.INVALID_FIELD CreateStagingTable name is not a valid name.” Specifying the “path” attribute also returns an error saying that I can’t do that with Unity Catalog. When creating the workflow I’m allowed to leave the schema blank, but I can’t figure out how to specify it in my code.

I’ve been googling around for a couple days and couldn’t find an answer. Would love if you could suggest a solution that didn’t involve having to create separate workflows for each schema. 

1 ACCEPTED SOLUTION

Accepted Solutions

cpayne_vax
New Contributor III

For anyone following along, I heard back from our Databricks rep and the training team that this is, in fact, not possible today as Kaniz suggests. 

> When configuring your Delta Live Tables pipeline with UC, you are prompted to select a UC catalog and schema. All tables created in the DLT pipeline will be created under that UC catalog and schema - you cannot create tables outside of this catalog / schema within the same DLT pipeline.

They recommend creating a separate pipeline for each Unity Catalog schema you need to update, and said there may be a fix for this sometime in Q1 2024. In the meantime, I'll move forward the long way.
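
For anyone taking the same route, here is a minimal sketch of how the one-pipeline-per-schema workaround could be organized: group the folders by target catalog and schema, then build one pipeline settings dict per group. The folder names, the notebook path, and the `ingest.folders` configuration key are hypothetical, and the settings keys (`catalog`, `target`, `libraries`, `configuration`) are an assumption about what the pipeline settings JSON expects, so check them against your workspace before relying on this.

```python
from collections import defaultdict

# Hypothetical mapping of source folder -> (catalog, schema, table).
TARGETS = [
    ("folderA", "Catalog1", "schemaA", "tableA"),
    ("folderB", "Catalog1", "schemaB", "tableB"),
    ("folderC", "Catalog2", "schemaC", "tableC"),
]


def pipeline_specs(targets, notebook_path):
    """Build one pipeline settings dict per (catalog, schema) pair."""
    grouped = defaultdict(list)
    for folder, catalog, schema, _table in targets:
        grouped[(catalog, schema)].append(folder)

    specs = []
    for (catalog, schema), folders in sorted(grouped.items()):
        specs.append(
            {
                "name": f"ingest_{catalog}_{schema}",
                "catalog": catalog,
                "target": schema,
                "libraries": [{"notebook": {"path": notebook_path}}],
                # Hypothetical key: the shared notebook would read this value
                # to decide which folders it should ingest.
                "configuration": {"ingest.folders": ",".join(folders)},
            }
        )
    return specs


if __name__ == "__main__":
    # "/Shared/ingest_notebook" is a placeholder path.
    for spec in pipeline_specs(TARGETS, "/Shared/ingest_notebook"):
        print(spec["name"], spec["catalog"], spec["target"])
```

The same ingestion notebook is shared by every pipeline; only the catalog, schema, and folder list differ, which keeps the duplicated part down to configuration.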

15 REPLIES

Thank you for the response @Retired_mod. It was close, but not exactly what I needed. I already know most of what you mentioned, except this one sentence:


> You can also specify the catalog and schema in your code using the @dlt.table syntax.

This is the part that I can't figure out. How do you do this using the @dlt syntax? I can't find any documentation on this, and when I try using "3 dots" to specify it, I receive errors. The "path" attribute also doesn't work for this, and the DLT Python reference doesn't mention it. Can you explain how to do this?

cpayne_vax
New Contributor III

Hello, just checking in here. Would love it if there were a solution. I tried using a "catalog" parameter in my @dlt.table decorator, but of course that didn't work.

data-engineer-d
Contributor

@cpayne_vax now that we are at the end of Q1 2024, do we have the ability to write to any schema dynamically?

mhscience525
New Contributor III

I haven't found a way to do that so far. I have to create separate pipelines.

Does anyone have an update on the possibility of writing to different schemas using the same pipeline?

manish1987c
New Contributor III

Please let me know if there is a way to define the schema dynamically within a single Delta Live Tables pipeline, in the same catalog, as shown below.

 

For example:

  • Catalog1 : schemaA : tableA
  • Catalog1 : schemaB : tableB

 

 

Taja
New Contributor II

Databricks announced that this feature will be available in public preview in Q4 2024. With it, it will be possible to publish tables to arbitrary catalogs and schemas from a single DLT pipeline.

surajitDE
New Contributor III

Q4 2024 is about to end and I still do not see any updates on this feature.

kuldeep-in
Databricks Employee

The 'Direct Publishing Mode' Public Preview is now live in all production regions. This feature allows you to write to multiple schemas and catalogs from the same pipeline.

user1234567899
New Contributor II

Hi @kuldeep-in

Could you please point me in the right direction on how to use it correctly? 

Getting this error:
com.databricks.pipelines.common.errors.DLTAnalysisException: Materializing tables in custom schemas is not supported. Please remove the database qualifier from table 'catalog.schema.table_name'.

Thank you

kuldeep-in
Databricks Employee

@user1234567899 Make sure to enable DPM (Direct Publishing Mode) from the Previews page.

[Screenshot: DPM Preview.png]

Once it is enabled, you should be able to use the schema name in DLT.
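
A minimal sketch of what that could look like in Python, assuming DPM is enabled on the pipeline and that a fully qualified `catalog.schema.table` name is accepted as the `name` argument of `@dlt.table`. The mapping, storage paths, and Parquet format are hypothetical placeholders. The `dlt` module and the `spark` session are passed in as arguments because they only exist inside a pipeline run.

```python
# Hypothetical mapping: fully qualified target table -> source folder.
SOURCES = {
    "catalog1.schemaA.tableA": "abfss://container@account.dfs.core.windows.net/folderA",
    "catalog1.schemaB.tableB": "abfss://container@account.dfs.core.windows.net/folderB",
    "catalog2.schemaC.tableC": "abfss://container@account.dfs.core.windows.net/folderC",
}


def define_table(dlt, spark, full_name, path):
    """Define a single table whose name carries its catalog and schema."""

    # Defining the table inside its own function binds full_name and path
    # per call, which avoids the late-binding closure problem in loops.
    @dlt.table(name=full_name)
    def load():
        # Assumption: the source folders contain Parquet files.
        return spark.read.format("parquet").load(path)

    return load


def register_tables(dlt, spark, sources):
    """Define one table per entry in the mapping."""
    for full_name, path in sources.items():
        define_table(dlt, spark, full_name, path)


# Inside a pipeline notebook the call would be:
#   import dlt
#   register_tables(dlt, spark, SOURCES)
```

If the pipeline still rejects the qualified name with the error quoted above, the pipeline itself is probably not running with DPM enabled, whatever the code looks like.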

@kuldeep-in, despite enabling this, I am still seeing the same error that custom schemas are not supported.

kuldeep-in
Databricks Employee

@Blitz8854 Can you confirm the cloud region and the DLT creation method (UI or other)?

Blitz8854
New Contributor II

For folks who have been struggling with this, pay close attention to your DABs deployment. You must use schema rather than target in your deployment YAML. When you set the schema, it will in fact work. Also, verify that your pipeline shows a checkmark in its settings indicating that DPM is enabled (it looks a lot like the indicator that a cluster is UC compatible).
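
As a rough illustration of that check, the sketch below inspects a pipeline's settings (written here as a Python dict mirroring the pipeline block of a bundle YAML file) and flags use of `target` instead of `schema`. The function name and the sample values are hypothetical; only the `schema` versus `target` distinction comes from this thread.

```python
def check_pipeline_settings(settings: dict) -> list[str]:
    """Return a list of problems found in one pipeline's settings."""
    problems = []
    if "target" in settings:
        problems.append("uses 'target'; this thread reports that 'schema' is needed")
    if "schema" not in settings:
        problems.append("no 'schema' key is set")
    return problems


if __name__ == "__main__":
    # Hypothetical settings, as they might appear under a pipeline
    # resource in a bundle file.
    old_style = {"name": "ingest", "catalog": "catalog1", "target": "schemaA"}
    new_style = {"name": "ingest", "catalog": "catalog1", "schema": "schemaA"}

    for label, settings in (("old", old_style), ("new", new_style)):
        print(label, check_pipeline_settings(settings))
```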