Data Engineering

Forum Posts

Yash_542965
by New Contributor II
  • 483 Views
  • 0 replies
  • 0 kudos

DLT aggregation problem

I'm utilizing SQL to perform aggregation operations within the gold layer of a DLT pipeline. However, I'm encountering an error when running the pipeline while attempting to return a data frame using spark.sql. Could anyone please assist me with the SQL...

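For readers hitting the same error: inside a DLT pipeline, a Python table function can return the DataFrame produced by `spark.sql`, but upstream DLT datasets must be referenced through the `LIVE` virtual schema. A minimal sketch, assuming hypothetical table and column names (`silver_orders`, `customer_id`, `amount`):

```python
import dlt

@dlt.table(name="gold_sales_summary")
def gold_sales_summary():
    # Reference the upstream DLT dataset via the LIVE schema and
    # return the aggregated DataFrame directly.
    return spark.sql("""
        SELECT customer_id,
               SUM(amount) AS total_amount,
               COUNT(*)    AS order_count
        FROM LIVE.silver_orders
        GROUP BY customer_id
    """)
```

Note this only runs inside a Databricks DLT pipeline, where the `dlt` module and the `spark` session are provided by the runtime.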
Prachi_Sankhala
by New Contributor
  • 5845 Views
  • 7 replies
  • 1 kudos

Resolved! What are the advantages of using Delta Live tables (DLT) over Data Build Tool (dbt) in Databricks?

Please explain with some use cases which show the difference between DLT and dbt.

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Prachi Sankhala​ Thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answe...

6 More Replies
dsldan
by New Contributor II
  • 1879 Views
  • 2 replies
  • 1 kudos

Resolved! DLT setup taking longer than actually building the tables

Hi all! We are using DLT for our ETL jobs, and we're noticing the setup steps (Initializing, Resetting tables, Setting up tables, Rendering graph) are taking much longer than actually ETL'ing the data into our tables. We have about 110 tables combined...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @daan duppen​ Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Thanks...

1 More Replies
guostong
by New Contributor III
  • 946 Views
  • 3 replies
  • 1 kudos

Issues to load from ADLS in DLT

I am using DLT to load CSV files from ADLS; below is my SQL query in the notebook: CREATE OR REFRESH STREAMING LIVE TABLE test_account_raw AS SELECT * FROM cloud_files( "abfss://my_container@my_storageaccount.dfs.core.windows.net/test_csv/", "csv", map("h...

Latest Reply
guostong
New Contributor III
  • 1 kudos

Thank you everyone, the problem is resolved; it went away once I had workspace admin access.

2 More Replies
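The SQL `cloud_files(...)` source in the question has a direct Python Auto Loader equivalent, which can be easier to debug. A sketch using the path from the post; the header option is an assumption, since the original `map(...)` argument is truncated:

```python
import dlt

@dlt.table(name="test_account_raw")
def test_account_raw():
    # "cloudFiles" is the Auto Loader source format; the abfss path
    # matches the one in the post.
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "csv")
        .option("header", "true")  # assumption: the truncated map(...) set a header option
        .load("abfss://my_container@my_storageaccount.dfs.core.windows.net/test_csv/")
    )
```

As the accepted reply notes, the failure here turned out to be a permissions issue rather than a syntax one, so checking workspace and storage access is worth doing before rewriting the query.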
BWong
by New Contributor III
  • 2557 Views
  • 2 replies
  • 1 kudos

Overwriting schema in Delta Live Tables

Hi all, I have a table created by DLT. Initially I specified cloudFiles.inferColumnTypes as false and all columns are stored as strings. However, I now want to use cloudFiles.inferColumnTypes=true. I dropped the table and re-ran the pipeline, which fai...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Billy Wong​ Thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers yo...

1 More Replies
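For anyone in the same situation: dropping the table alone is usually not enough, because the streaming checkpoint and the inferred-schema location still describe the old string-typed schema. A sketch of the changed option (the source path is hypothetical); after changing it, running the pipeline with a full refresh rebuilds the schema and checkpoint state:

```python
import dlt

@dlt.table(name="bronze_events")
def bronze_events():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "csv")
        # Changed from "false": Auto Loader now samples files to infer column types
        # instead of reading every column as a string.
        .option("cloudFiles.inferColumnTypes", "true")
        .load("/mnt/source/events/")  # hypothetical path
    )
```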
BenLambert
by Contributor
  • 1437 Views
  • 1 reply
  • 0 kudos

How to deal with deleted files in source directory in DLT?

We have a DLT pipeline that uses the autoloader to detect files added to a source storage bucket. It reads these updated files and adds new records to a bronze streaming table. However we would also like to automatically delete records from the bronz...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Bennett Lambert​: Yes, it is possible to automatically delete records from the bronze table when a source file is deleted, without doing a full refresh. One way to achieve this is by using the Change Data Capture (CDC) feature in Databricks Delta. CD...

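The CDC approach the reply alludes to maps onto DLT's `apply_changes`, which can turn delete events into deletes on the target via `apply_as_deletes`. A hedged sketch; the source table and the `id`, `operation`, and `sequence_ts` columns are hypothetical:

```python
import dlt
from pyspark.sql.functions import expr

dlt.create_streaming_table("silver_records")

# Propagate deletes from a CDC-style change feed into the target
# instead of full-refreshing the table.
dlt.apply_changes(
    target="silver_records",
    source="bronze_changes",
    keys=["id"],
    sequence_by="sequence_ts",
    apply_as_deletes=expr("operation = 'DELETE'"),
)
```

One caveat: Auto Loader itself only picks up new files, so something upstream still has to emit a delete event when a file disappears from the bucket.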
PearceR
by New Contributor III
  • 3699 Views
  • 2 replies
  • 1 kudos

Resolved! custom upsert for delta live tables apply_changes()

Hello community :). I am currently implementing some pipelines using DLT. They are working great for my medallion architecture: landed JSON in bronze -> silver (using apply_changes), then materialized gold views on top. However, I am attempting to crea...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

@Robert Pearce​: It is possible to achieve the desired behavior using apply_changes in Databricks Delta Lake. You can use the merge operation to merge data from your source into your target Delta table, and then use whenMatchedUpdate to update the id...

1 More Replies
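The merge-with-`whenMatchedUpdate` pattern the reply describes looks roughly like the following with the Delta Lake `DeltaTable` API. Table and column names are hypothetical; note that `apply_changes` itself offers limited customization, so this kind of selective upsert is typically run in a regular job rather than inside the DLT graph:

```python
from delta.tables import DeltaTable
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
updates = spark.table("staging_updates")  # hypothetical source of changed rows

target = DeltaTable.forName(spark, "silver_customers")

(
    target.alias("t")
    .merge(updates.alias("s"), "t.id = s.id")
    # Update only selected columns instead of the blanket
    # overwrite behaviour of apply_changes.
    .whenMatchedUpdate(set={"status": "s.status", "updated_at": "s.updated_at"})
    .whenNotMatchedInsertAll()
    .execute()
)
```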
J_M_W
by Contributor
  • 2761 Views
  • 3 replies
  • 3 kudos

Resolved! Can you use %run or dbutils.notebook.run in a Delta Live Table pipeline?

Hi there, can you use %run or dbutils.notebook.run() in a Delta Live Table (DLT) pipeline? When I try, I get the following error: "IllegalArgumentException: requirement failed: To enable notebook workflows, please upgrade your Databricks subscriptio...

Latest Reply
J_M_W
Contributor
  • 3 kudos

Hi all. @Kaniz Fatma​ thanks for your answer. I am on the premium pricing tier in Azure. After digging around the logs it would seem that you cannot run magic commands in a Delta Live Table pipeline. Therefore, you cannot use %run in a DLT pipeline - w...

2 More Replies
YSF
by New Contributor III
  • 650 Views
  • 1 reply
  • 0 kudos

Delta Live Table & Autoloader adding a non-existent column

I'm trying to set up Autoloader to read some CSV files. I tried both Autoloader with the DLT decorator and Autoloader by itself. The first column of the data is called "run_id"; when I do a spark.read.csv() directly on the file it com...

Latest Reply
Rishabh264
Honored Contributor II
  • 0 kudos

Can you attach the exact output so that I can have a look at it?

YSF
by New Contributor III
  • 407 Views
  • 1 reply
  • 0 kudos

Any elegant pattern for Autoloader/DLT development?

Does anyone have a workflow or pattern that works for developing with Autoloader/DLT? I'm still new to it, but the fact that testing creates checkpoints using schema locations makes it really tricky to develop with and hammer out a working ve...

Latest Reply
Rishabh264
Honored Contributor II
  • 0 kudos

What exactly are you referring to here as a pattern?

kinsun
by New Contributor II
  • 936 Views
  • 3 replies
  • 0 kudos

Resolved! Delta Live Table Service Upgrade

Dear experts, might I know what will happen to a Delta Live Table pipeline that is in a cancelled state when there is a runtime service upgrade? Thanks!

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@KS LAU​: When a runtime service upgrade occurs in Databricks, any running tasks or pipelines may be temporarily interrupted while the upgrade is being applied. In the case of a cancelled Delta Live Table pipeline, it will not be impacted by the upgr...

2 More Replies
GuMart
by New Contributor III
  • 2583 Views
  • 5 replies
  • 2 kudos

Resolved! DLT target schema - get value during run time

Hi, I would like to know if it is possible to get the target schema, programmatically, inside a DLT (in DLT pipeline settings: destination, target schema). I want to run more idempotent pipelines. For example, my target table has the fields: reference_da...

Latest Reply
GuMart
New Contributor III
  • 2 kudos

Thank you @Suteja Kanuri​, looks like your solution is working. Regards,

4 More Replies
Khalil
by Contributor
  • 3442 Views
  • 6 replies
  • 5 kudos

Resolved! Pivot a DataFrame in Delta Live Table DLT

I want to apply a pivot on a dataframe in DLT but I'm getting the following warning: Notebook:XXXX used `GroupedData.pivot` function that will be deprecated soon. Please fix the notebook. I get the same warning if I use the collect function. Is it risk...

Latest Reply
Khalil
Contributor
  • 5 kudos

Thanks @Kaniz Fatma​ for your support. The solution was to do the pivot outside of views or tables, and the warning disappeared.

5 More Replies
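One way to read the accepted answer, sketched here with hypothetical table and column names: perform the pivot in a plain helper that runs eagerly at graph-build time, and have the DLT table function return the already-pivoted DataFrame, so `GroupedData.pivot` never appears inside a `@dlt.table` or `@dlt.view` body:

```python
import dlt

def pivoted_source():
    # Pivot in a plain helper, outside any @dlt.table/@dlt.view body,
    # so DLT's analysis does not flag GroupedData.pivot.
    df = spark.read.table("main.raw.device_metrics")  # hypothetical table
    return (
        df.groupBy("device_id")
        .pivot("metric_name")            # eager, batch-only operation
        .agg({"metric_value": "max"})
    )

PIVOTED = pivoted_source()  # evaluated once, before the DLT graph runs

@dlt.table(name="gold_metrics_wide")
def gold_metrics_wide():
    return PIVOTED
```

Note that `pivot` is a batch operation, so this pattern applies to materialized (non-streaming) tables.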
sika
by New Contributor II
  • 3914 Views
  • 4 replies
  • 0 kudos

ignoreDeletes in DLT pipeline

Hi all, I have a DLT pipeline like so: raw -> cleansed (SCD2) -> curated. 'Raw' uses Autoloader to continuously read files from a data lake. These files can contain tons of duplicates, which causes our raw table to become quite large. Therefore, we ...

Latest Reply
sika
New Contributor II
  • 0 kudos

OK, I'll try and add additional details. Firstly, the diagram below shows our current dataflow. Our raw table is defined as such: TABLES = ['table1','table2'] def generate_tables(table_name): @dlt.table( name=f'raw_{table_name}', table_pro...

3 More Replies
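The truncated snippet above is a loop-generated-table pattern; a hedged reconstruction of its general shape follows. The `table_properties`, format, and source path are assumptions, since the original post cuts off mid-definition:

```python
import dlt

TABLES = ["table1", "table2"]

def generate_tables(table_name):
    # table_properties here are hypothetical; the original post is truncated.
    @dlt.table(
        name=f"raw_{table_name}",
        table_properties={"delta.enableChangeDataFeed": "true"},
    )
    def raw():
        return (
            spark.readStream.format("cloudFiles")
            .option("cloudFiles.format", "json")   # assumed format
            .load(f"/mnt/lake/{table_name}/")      # hypothetical source path
        )

for t in TABLES:
    generate_tables(t)
```

On the thread's actual topic: a downstream streaming reader of a Delta table whose old rows are deleted (e.g. by a retention job) can set `.option("ignoreDeletes", "true")` so those partition-level deletes do not break the stream.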