Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
I am building out a new DLT pipeline and have since had to rebuild it from scratch. Having deleted the old pipeline and constructed a new one, I now get this error: Table 'X' is already managed by pipeline 'Y'. As I only have the one pipeline, how would...
Rename the function under your @dlt.table decorator, for example:

@dlt.table(
    comment="example",
    table_properties={"example": "example"},
    partition_cols=["a", "b", "c"],
)
def modify_this_name():
I understand that DLT uses separate job compute, but I would like to use an existing all-purpose cluster for the DLT pipeline. Is there a way I can achieve this?
Thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers your question? This...
Hi there. I encountered an issue when trying to create my Delta Live Tables pipeline. The error is "DataPlaneException: Failed to launch pipeline cluster 1202-031220-urn0toj0: Could not launch cluster due to cloud provider failures. azure_error...
@Simon Xu I suspect that DLT is trying to grab machine types that you simply have zero quota for in your Azure account. By default, the machine types below get requested behind the scenes for DLT:
AWS: c5.2xlarge
Azure: Standard_F8s
GCP: e2-standard-8...
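If quota for the default machine type is the problem, the pipeline's JSON settings accept an explicit node type override for the pipeline cluster. A sketch, assuming Azure; `Standard_DS3_v2` and the autoscale bounds are just example values to substitute for a type you actually have quota for:

```json
{
  "clusters": [
    {
      "label": "default",
      "node_type_id": "Standard_DS3_v2",
      "autoscale": { "min_workers": 1, "max_workers": 2 }
    }
  ]
}
```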
Hello! I'm very new to working with Delta Live Tables and I'm having some issues. I'm trying to import a large amount of historical data into DLT. However, letting the DLT pipeline run forever doesn't work with the database we're trying to import from...
I need to process some transformation on incoming data as a batch and want to know if there is a way to use the foreachBatch option in Delta Live Tables. I am using Auto Loader to load JSON files, and then I need to apply foreachBatch and store results into ano...
Not sure if this will apply to you or not... I was looking at foreachBatch to reduce the workload of getting distinct data from a history table of 20 million+ records, because the df.dropDuplicates() function was intermittently running out of ...
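The idea behind per-batch deduplication can be illustrated without Spark: carry a running set of seen keys across batches and drop repeats. A pure-Python stand-in for what a foreachBatch callback would do with df.dropDuplicates() plus a MERGE into the target table (the real callback operates on DataFrames, of course):

```python
def dedupe_batches(batches):
    """Keep the first occurrence of each key across a sequence of batches.

    Pure-Python stand-in for foreachBatch-style deduplication: the work
    happens batch by batch, while state (here, the set of seen keys)
    persists across batches -- in Spark that state would live in the
    target Delta table you MERGE into.
    """
    seen = set()
    out = []
    for batch in batches:            # each batch: a list of (key, value) rows
        for key, value in batch:
            if key not in seen:      # first time we see this key -> keep it
                seen.add(key)
                out.append((key, value))
    return out

# Example: three micro-batches with overlapping keys
batches = [[(1, "a"), (1, "b")], [(2, "c"), (1, "d")], [(3, "e")]]
print(dedupe_batches(batches))   # -> [(1, 'a'), (2, 'c'), (3, 'e')]
```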
Hi team, there used to be an option to provide DLT pipeline settings either via the UI or JSON, but I do not see it anymore after switching to the new UI. Is this expected? Am I missing something? Here is a screenshot for reference.
Hello, I'm developing a DLT pipeline configured in continuous mode. I'm still in dev mode, so I stop my pipeline when I'm not working on it. My problem is that the pipeline is frequently started by SERVICE_UPGRADE. Example of message: 'Update xxxxx starte...
Hello, I have some nested columns with a hyphen (i.e. sample-1) in a struct column, and recently the DLT pipeline has started throwing a syntax error. Before May 24, 2023, this was working fine. Is this a new bug in the May 2023 release? After clearing the table and the table's da...
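For hyphenated struct fields, Spark SQL generally needs the field name wrapped in backticks (e.g. payload.`sample-1`), since a bare hyphen parses as a minus operator. A small hypothetical helper sketching that quoting rule:

```python
def quote_field(name: str) -> str:
    """Wrap a field name in backticks when it contains characters (such as
    '-') that Spark SQL would otherwise parse as operators.

    Hypothetical helper for illustration only; in a query you would write
    e.g.  SELECT payload.`sample-1` FROM ...
    """
    safe = name.replace("_", "")     # underscores are fine unquoted
    return name if safe.isalnum() else f"`{name}`"

print(quote_field("sample-1"))   # -> `sample-1`
print(quote_field("sample_1"))   # -> sample_1
```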
Hi @Rishabh Tomar We haven't heard from you since the last response from @Kaniz Fatma. Kindly share the information with us, and in return, we will provide you with the necessary solution. Thanks and Regards
I am trying to set up Delta Live Tables pipelines to ingest data to bronze and silver tables. Bronze and Silver are separate schemas. This will be triggered by a daily job. It appears to run fine when set as continuous, but fails when triggered. Table...
Hi, I create a table using a DLT pipeline (triggered once). In the ETL process, I add a new column with Null values to the table by:
output = output.withColumn('Indicator_Latest_Value_Date', F.lit(None))
The pipeline works and I don't get any error. But, whe...
I'm having an issue accessing an Excel file through a DLT pipeline. The file is in ADLS, and I'm using pandas to read the Excel. It seems pandas is not able to understand the abfss protocol. Is there any way to read Excel with pandas in a DLT pipeline? I'm getting thi...
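pandas can only open abfss:// paths when fsspec plus the adlfs package are installed (then pd.read_excel(path, storage_options={...}) works); another workaround is to address the file over its HTTPS endpoint with a SAS token. A hypothetical helper showing how an abfss URI maps onto that HTTPS endpoint:

```python
def abfss_to_https(abfss_path: str) -> str:
    """Map an abfss:// URI to its https:// ADLS Gen2 endpoint.

    abfss://<container>@<account>.dfs.core.windows.net/<path>
      -> https://<account>.dfs.core.windows.net/<container>/<path>

    Hypothetical helper for illustration; authentication (a SAS token, or
    credentials passed via storage_options) still has to be supplied
    separately.
    """
    prefix = "abfss://"
    if not abfss_path.startswith(prefix):
        raise ValueError(f"not an abfss URI: {abfss_path}")
    rest = abfss_path[len(prefix):]
    container_at_host, _, path = rest.partition("/")
    container, _, host = container_at_host.partition("@")
    return f"https://{host}/{container}/{path}"

url = abfss_to_https("abfss://data@myacct.dfs.core.windows.net/raw/report.xlsx")
print(url)   # -> https://myacct.dfs.core.windows.net/data/raw/report.xlsx
```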
Could you please guide me on how to create a DLT pipeline that directly reads data from JDBC? When I created the DLT pipeline, it gave me an error at "Setting up table". If I run it interactively in notebooks it runs successfully, but in non-interactive mode...
What you are trying to do is not possible:
DLT uses Auto Loader, not JDBC
no jars (DLT is SQL/Python only)
I'd skip DLT for this scenario and use an ordinary notebook; nothing wrong with that.
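In an ordinary notebook the JDBC read is straightforward. A sketch assuming a PostgreSQL source; the host, database, table, and credentials below are placeholders to replace with your own (ideally pulled from a Databricks secret scope):

```python
# Hypothetical connection settings -- substitute your own host, database,
# table, and credentials; never hard-code a real password.
jdbc_options = {
    "url": "jdbc:postgresql://dbserver:5432/mydb",
    "dbtable": "public.orders",
    "user": "reader",
    "password": "REDACTED",                  # use dbutils.secrets.get(...) in practice
    "driver": "org.postgresql.Driver",
}

# In a regular (non-DLT) notebook with a running SparkSession:
#   df = spark.read.format("jdbc").options(**jdbc_options).load()
#   df.write.format("delta").mode("append").saveAsTable("bronze.orders")
```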
I am in a situation where I have a notebook that runs in a pipeline that creates a "live streaming table", so I cannot use a language other than SQL in the pipeline. I would like to format a certain column in the pipeline using Scala code (it's a ...
Hi all, I have a DLT pipeline as so: raw -> cleansed (SCD2) -> curated. 'Raw' is utilizing Auto Loader to continuously read files from a data lake. These files can contain tons of duplicates, which causes our raw table to become quite large. Therefore, we ...
Ok, I'll try and add additional details. Firstly, the diagram below shows our current dataflow. Our raw table is defined as such:

TABLES = ['table1', 'table2']

def generate_tables(table_name):
    @dlt.table(
        name=f'raw_{table_name}',
        table_pro...
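The metaprogramming pattern above hinges on defining the decorated function inside generate_tables, so each closure captures its own table_name rather than a shared loop variable. A pure-Python stand-in for @dlt.table makes the mechanics visible without a pipeline:

```python
registry = {}   # stand-in for the dataflow graph that @dlt.table builds

def table(name):
    """Minimal stand-in for dlt.table, for illustration only."""
    def decorator(fn):
        registry[name] = fn   # register the table definition under its name
        return fn
    return decorator

TABLES = ["table1", "table2"]

def generate_tables(table_name):
    @table(name=f"raw_{table_name}")
    def define():
        # A real DLT function would return a streaming DataFrame here,
        # e.g. spark.readStream.format("cloudFiles")...
        return f"reading {table_name}"
    return define

for t in TABLES:
    generate_tables(t)

print(sorted(registry))           # -> ['raw_table1', 'raw_table2']
print(registry["raw_table1"]())   # -> reading table1
```

Because each call to generate_tables creates a fresh scope, 'raw_table1' and 'raw_table2' each read their own source, which is exactly what the real @dlt.table loop relies on.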
Hi, is it possible to set the RETRY_ON_FAILURE property for DLT pipelines through the API? I'm not finding it in the docs (although it seems to exist in a response payload). https://docs.databricks.com/delta-live-tables/api-guide.html