cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

Forum Posts

sarguido
by New Contributor II
  • 3343 Views
  • 5 replies
  • 2 kudos

Delta Live Tables: bulk import of historical data?

Hello! I'm very new to working with Delta Live Tables and I'm having some issues. I'm trying to import a large amount of historical data into DLT. However letting the DLT pipeline run forever doesn't work with the database we're trying to import from...

  • 3343 Views
  • 5 replies
  • 2 kudos
Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @Sarah Guido​ Thank you for posting your question in our community! We are happy to assist you.To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers y...

  • 2 kudos
4 More Replies
daz
by New Contributor III
  • 6404 Views
  • 9 replies
  • 3 kudos

DLT managed by non-existent pipeline

I am building out a new DLT pipeline and have since had to rebuild it from scratch. Having deleted the old pipeline and constructed a new one I now get this error:Table 'X' is already managed by pipeline 'Y'. As I only have the one pipeline how would...

  • 6404 Views
  • 9 replies
  • 3 kudos
Latest Reply
Shinaider777
New Contributor II
  • 3 kudos

rename your function from @Dlt.table, for exemple:@Dlt.table(    comment="exemple",    table_properties={"exemple": "exemple"},    partition_cols=["a", "b", "c"])def modify_this_name():

  • 3 kudos
8 More Replies
SimonXu
by New Contributor II
  • 9650 Views
  • 6 replies
  • 15 kudos

Resolved! Failed to launch pipeline cluster

Hi, there. I encountered an issue when I was trying to create my delta live table pipeline. The error is "DataPlaneException: Failed to launch pipeline cluster 1202-031220-urn0toj0: Could not launch cluster due to cloud provider failures. azure_error...

cluster failed to start usage and quota
  • 9650 Views
  • 6 replies
  • 15 kudos
Latest Reply
arpit
Databricks Employee
  • 15 kudos

@Simon Xu​ I suspect that DLT is trying to grab some machine types that you simply have zero quota for in your Azure account. By default, below machine type gets requested behind the scenes for DLT:AWS: c5.2xlargeAzure: Standard_F8sGCP: e2-standard-8...

  • 15 kudos
5 More Replies
rdobbss
by New Contributor II
  • 4249 Views
  • 3 replies
  • 3 kudos

How to use foreachbatch in deltalivetable or DLT?

I need to process some transformation on incoming data as a batch and want to know if there is way to use foreachbatch option in deltalivetable. I am using autoloader to load json files and then I need to apply foreachbatch and store results into ano...

  • 4249 Views
  • 3 replies
  • 3 kudos
Latest Reply
TomRenish
New Contributor III
  • 3 kudos

Not sure if this will apply to you or not...I was looking at the foreachbatch tool to reduce the workload of getting distinct data from a history table of 20million + records because the df.dropDuplicates() function was intermittently running out of ...

  • 3 kudos
2 More Replies
Chhaya
by New Contributor III
  • 1851 Views
  • 2 replies
  • 2 kudos

DLT config/setting json support

hi team,There used to be option to provide DLT pipeline settings either via UI or JSON, but I do not see it anymore after switching to new UI. Is this something expected ? am I missing something ? here is screenshot for reference.

image
  • 1851 Views
  • 2 replies
  • 2 kudos
Latest Reply
User16752245772
Contributor
  • 2 kudos

Hi @Chhaya Vishwakarma​ This option is available, could you please clear the browser cache and try ? or can you try in an incognito window?

  • 2 kudos
1 More Replies
alemo
by New Contributor III
  • 979 Views
  • 1 replies
  • 1 kudos

DLT started by SERVICE_UPGRADE

HelloI'm developing a dlt pipeline, configured in continuous mode.I'm still in dev mode, so I stop my pipeline when i'm not working on it.My problem is that the pipeline is frequently started by SERVICE_UPGRADE.example of message:'Update xxxxx starte...

  • 979 Views
  • 1 replies
  • 1 kudos
Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @alex mo​ Great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question. Thanks.

  • 1 kudos
Rishabh_T
by New Contributor III
  • 4317 Views
  • 4 replies
  • 5 kudos

Resolved! DLT pipeline is unable to process struct with hyphen in nested column name

Hello,I have some nested columns with hyphen i.e. sample-1 in struct column, recently DLT pipeline has started throwing synatx error. Before May 24, 2023, this was working fine.Is this a new bug in May 2023 release?After clearing table and table's da...

Error
  • 4317 Views
  • 4 replies
  • 5 kudos
Latest Reply
Anonymous
Not applicable
  • 5 kudos

Hi @Rishabh Tomar​ We haven't heard from you since the last response from @Kaniz Fatma​  . Kindly share the information with us, and in return, we will provide you with the necessary solution. Thanks and Regards

  • 5 kudos
3 More Replies
js54123875
by New Contributor III
  • 2757 Views
  • 3 replies
  • 3 kudos

Setup for Unity Catalog, autoloader, three-level namespace, SCD2

I am trying to setup delta live tables pipelines to ingest data to bronze and silver tables. Bronze and Silver are separate schema. This will be triggered by a daily job. It appears to run fine when set as continuous, but fails when triggered.Table...

  • 2757 Views
  • 3 replies
  • 3 kudos
Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @Jennette Shepard​ Thank you for posting your question in our community! We are happy to assist you.To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answ...

  • 3 kudos
2 More Replies
Mado
by Valued Contributor II
  • 5472 Views
  • 1 replies
  • 0 kudos

Resolved! Error when query a table created by DLT pipeline; "Couldn't find value of a column"

Hi, I create a table using DLT pipeline (triggered once). In the ETL process, I add a new column to the table with Null values by:output = output.withColumn('Indicator_Latest_Value_Date', F.lit(None))Pipeline works and I don't get any error. But, whe...

  • 5472 Views
  • 1 replies
  • 0 kudos
Latest Reply
josruiz22
New Contributor III
  • 0 kudos

Hi,Try converting the None of the output line this :output = output.withColumn('Indicator_Latest_Value_Date', F.lit(None).cast("String"))

  • 0 kudos
Yash_542965
by New Contributor II
  • 7961 Views
  • 2 replies
  • 3 kudos

Resolved! Access Excel file in delta live pipeline

I'm having an issue accessing the excel through dlt pipeline. the file is in ADLS I'm using pandas to read the Excel. It seems pandas are not able to understand abfss protocol is there any way to read Excel with pandas in dlt pipeline?I'm getting thi...

  • 7961 Views
  • 2 replies
  • 3 kudos
Latest Reply
Yash_542965
New Contributor II
  • 3 kudos

Thanks for the info. It works just need to install an additional library using "%pip install openpyxl".

  • 3 kudos
1 More Replies
Neha_1688
by New Contributor II
  • 2100 Views
  • 2 replies
  • 3 kudos

Resolved! DLT pipeline that reads data from JDBC source

Could you please guide on how to create the DLT pipeline that directly reads the data from jdbc.When I created the DLT pipeline it give me error at Setting up table, If I ran interactively in notebooks it run successfully, but in non interactive mode...

  • 2100 Views
  • 2 replies
  • 3 kudos
Latest Reply
-werners-
Esteemed Contributor III
  • 3 kudos

What you try do to is not possible.dlt uses autoloader, not jdbcno jars (dlt is sql/python only)I'd skip DLT for this scenario and use an ordinary notebook, nothing wrong with that.

  • 3 kudos
1 More Replies
qwerty1
by Contributor
  • 6547 Views
  • 3 replies
  • 1 kudos

Is there a way to register a scala function that is available to other notebooks?

I am in a situation where I have a notebook that runs in a pipeline that creates a "live streaming table". So, I cannot use a language other than sql in the pipeline. I would like to format a certain column in the pipeline using a scala code (it's a ...

  • 6547 Views
  • 3 replies
  • 1 kudos
Latest Reply
-werners-
Esteemed Contributor III
  • 1 kudos

no, DLT does not work with Scala unfortunately.Delta Live Tables are not vanilla spark.Is python an option instead of scala?

  • 1 kudos
2 More Replies
sika
by New Contributor II
  • 11051 Views
  • 2 replies
  • 0 kudos

ignoreDeletes in DLT pipeline

Hi all,I have a DLT pipeline as so:raw -> cleansed (SCD2) -> curated. 'Raw' is utilizing autoloader, to continously read file from a datalake. These files can contain tons of duplicate, which causes our raw table to become quite large. Therefore, we ...

  • 11051 Views
  • 2 replies
  • 0 kudos
Latest Reply
sika
New Contributor II
  • 0 kudos

Ok, i'll try an add additional details. Firstly: The diagram below shows our current dataflow: Our raw table is defined as such: TABLES = ['table1','table2']   def generate_tables(table_name): @dlt.table( name=f'raw_{table_name}', table_pro...

  • 0 kudos
1 More Replies
GuMart
by New Contributor III
  • 2168 Views
  • 2 replies
  • 1 kudos

Delta Live Tables - RETRY_ON_FAILURE

Hi,Is it possible to set it up the RETRY_ON_FAILURE property for DLTs through the API?I'm not finding in the Docs (although it seems to exist in a response payload).https://docs.databricks.com/delta-live-tables/api-guide.html

  • 2168 Views
  • 2 replies
  • 1 kudos
Latest Reply
GuMart
New Contributor III
  • 1 kudos

Hi @Suteja Kanuri​ ,Thank you so much for the quick and complete answer!Regards,

  • 1 kudos
1 More Replies
EDDatabricks
by Contributor
  • 1714 Views
  • 2 replies
  • 3 kudos

DLT pipeline slow streaming (root cause needs to be identified)

Dear support,we have the following situation where a set of DLT pipelines are streaming with very low rate incoming data and we need to find the root cause of this delay.In order to provide more insight about the setup of the DLT pipelines and some m...

  • 1714 Views
  • 2 replies
  • 3 kudos
Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @EDDatabricks EDDatabricks​ Thank you for posting your question in our community! We are happy to assist you.To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that ...

  • 3 kudos
1 More Replies
Labels