Pass Delta Live Table name dynamically
06-27-2022 09:48 PM
Hi Team,
Can we pass the Delta Live Table name dynamically [from a configuration file, instead of hardcoding the table name]? We would like to build a metadata-driven pipeline.
Labels: Delta Live Tables
06-28-2022 06:12 AM
Yes, it is possible. Just pass a variable to @dlt.table(name=variable), for example:

import dlt

for name in ['table1', 'table2']:
    @dlt.table(name=name)
    def delta_live_table():
        return spark.range(1, 10)
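To make this metadata-driven, the table list can also come from the pipeline configuration instead of being hardcoded in the notebook. A minimal sketch, assuming a hypothetical pipeline configuration key "source_tables" that holds a comma-separated list of table names:

import dlt

# "source_tables" is a hypothetical key set in the pipeline's configuration,
# e.g. source_tables = "table1,table2". The second argument is a fallback.
table_names = spark.conf.get("source_tables", "table1,table2").split(",")

for name in table_names:
    @dlt.table(name=name)
    def delta_live_table():
        # Placeholder data; in practice each table would read its own source.
        return spark.range(1, 10)

Because @dlt.table(name=name) is evaluated when the decorator runs, each loop iteration registers a distinct table even though the function name is reused.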
06-28-2022 06:56 AM
Thanks @Hubert Dudek for your quick response on this, I am now able to create DLT tables dynamically.
Can we pass the database name while creating DLT tables, instead of passing it in the pipeline configuration?
Error message:
org.apache.spark.sql.AnalysisException: Materializing tables in custom schemas is not supported. Please remove the database qualifier from table 'default.Delta_table3'.
07-08-2022 07:21 AM
I hope this limitation gets resolved - storing everything from one pipeline in a single database is not ideal. Preferably I'd like to be able to store bronze-level data in its own "database" rather than mix it with silver/gold.
07-24-2022 10:15 PM
Hi @Dan Richardson, there is already a feature request for this limitation in the queue. The feature request ID is DB-I-5073. We do not have an ETA on it yet; it will be implemented once prioritized. Please note that you won't be able to access the feature request as it is internal to Databricks, however you can always follow up with the above ID for a status update.
08-15-2022 01:55 PM
Hi @Dan Richardson,
Just a friendly follow-up. Do you have any follow-up questions, or did Noopur's response help you? Please let us know.
01-17-2024 02:12 PM
Hi, have there been any updates on this feature or internal ticket? This would be a great addition. Thanks!
02-27-2024 02:50 PM
I am observing the same error when I add dataset.tablename (a schema-qualified table name):
org.apache.spark.sql.catalyst.ExtendedAnalysisException: Materializing tables in custom schemas is not supported. Please remove the database qualifier from table 'streaming.dlt_read_test_files'
@dlt.table(name="streaming.dlt_read_test_files")
def raw_data():
    return spark.readStream.format("delta").load(abfss_location)

@dlt.table(name="streaming.dlt_clean_test_files")
def filtered_data():
    return dlt.readStream("streaming.dlt_read_test_files").select(F.col("data"))
Do we have an update on this topic?
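For reference, a minimal sketch (not from the thread) of the pattern that avoids this error in the current publishing mode: keep the table names unqualified and set the target schema in the pipeline's "target" setting instead of in @dlt.table. The storage path below is a placeholder.

import dlt
from pyspark.sql import functions as F

# Placeholder source path; adjust to your storage account.
abfss_location = "abfss://<container>@<account>.dfs.core.windows.net/raw/"

# Unqualified name: the schema comes from the pipeline's "target" setting.
@dlt.table(name="dlt_read_test_files")
def raw_data():
    return spark.readStream.format("delta").load(abfss_location)

@dlt.table(name="dlt_clean_test_files")
def filtered_data():
    # Reference the upstream table by its unqualified name as well.
    return dlt.read_stream("dlt_read_test_files").select(F.col("data"))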
07-12-2024 02:20 PM
Hello,
I wonder if there is any update on this feature?
Thanks
10-17-2024 08:04 AM
This would be a great improvement to DLT. The majority of the architecture requirements I see separate bronze, silver, and gold at the schema level. We can get around the issue by splitting the DLT pipeline into three separate ones, but you lose the ability to follow the pipeline end-to-end and it adds delays in processing.
12-11-2024 04:09 AM
Is this post referring to Direct Publishing Mode? As we are multi-tenanted we have to have separate schema per client, which currently means a single pipeline per client. This is not cost effective at all, so we are very much reliant on DPM. I believe it is not due to go in to public preview now until February at the earliest. It would be good if this can be fast-tracked if more people need it.