Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

John_BardessGro
by New Contributor II
  • 4832 Views
  • 3 replies
  • 4 kudos

Cluster Reuse for delta live tables

I have several delta live table notebooks that are tied to different delta live table jobs so that I can use multiple target schema names. I know it's possible to reuse a cluster for job segments but is it possible for these delta live table jobs (w...

Latest Reply
Kaniz
Community Manager
  • 4 kudos

Hi @John Fico, we haven’t heard from you since the last response from @Hubert Dudek and @Jose Gonzalez, and I was checking back to see if you have a resolution yet. If you have any solution, please share it with the community as it can be helpf...

  • 4 kudos
2 More Replies
User16835756816
by Valued Contributor
  • 2635 Views
  • 1 reply
  • 6 kudos

How can I simplify my data ingestion by processing the data as it arrives in cloud storage?

This post will help you simplify your data ingestion by utilizing Auto Loader, Delta Optimized Writes, Delta Write Jobs, and Delta Live Tables. Pre-Req: You are using JSON data and Delta Writes commands. Step 1: Simplify ingestion with Auto Loader Delt...

Latest Reply
youssefmrini
Honored Contributor III
  • 6 kudos

This post will help you simplify your data ingestion by utilizing Auto Loader, Delta Optimized Writes, Delta Write Jobs, and Delta Live Tables. Pre-Req: You are using JSON data and Delta Writes commands. Step 1: Simplify ingestion with Auto Loader Delta...

  • 6 kudos
kfoster
by Contributor
  • 950 Views
  • 2 replies
  • 3 kudos

DLT Event Log

I am trying to use the event log that DLT keeps updated, and I noticed that some of the fields are consistently empty/null. In the event log, located at ".../storage/system/events", I see the field "origin" and there are nested fields within which are empty/n...

Latest Reply
jose_gonzalez
Moderator
  • 3 kudos

Hi @Kristian Foster, the following docs provide more details on the event log schema. Please refer to this link: https://docs.databricks.com/workflows/delta-live-tables/delta-live-tables-event-log.html#monitor-pipelines-with-the-delta-live-tables...

  • 3 kudos
1 More Replies
SamSteere
by New Contributor III
  • 1170 Views
  • 3 replies
  • 6 kudos

docs.databricks.com

REST API documentation is out of date since the release of Delta Live Tables. When using the `2.0/clusters/list` endpoint in an environment with running clusters provisioned by DLTs, the clusters will be returned with a `cluster_source` value of `PIPEL...
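A minimal sketch of how one might spot this kind of mismatch, assuming the usual `2.0/clusters/list` response shape (a top-level `clusters` array whose items carry `cluster_source`). The set of documented values, the sample payload, and the placeholder `EXAMPLE_UNLISTED_SOURCE` are illustrative assumptions; the actual value reported in the post is truncated above and is not reproduced here.

```python
from collections import Counter
from typing import Any, Dict, List, Set

# Assumption: the cluster_source values listed in the REST API reference
# you are comparing against. Adjust to match your docs version.
DOCUMENTED_SOURCES: Set[str] = {"UI", "JOB", "API"}


def count_by_source(response: Dict[str, Any]) -> Counter:
    """Count clusters per cluster_source in a clusters/list response body."""
    return Counter(
        cluster.get("cluster_source", "<missing>")
        for cluster in response.get("clusters", [])
    )


def unlisted_clusters(
    response: Dict[str, Any],
    documented: Set[str] = DOCUMENTED_SOURCES,
) -> List[Dict[str, Any]]:
    """Return clusters whose cluster_source is absent from the documented set.

    Clusters with no cluster_source field at all are returned as well.
    """
    return [
        cluster
        for cluster in response.get("clusters", [])
        if cluster.get("cluster_source") not in documented
    ]


# Hypothetical payload: IDs, names, and the third source value are made up.
SAMPLE_RESPONSE: Dict[str, Any] = {
    "clusters": [
        {"cluster_id": "c-1", "cluster_name": "adhoc", "cluster_source": "UI"},
        {"cluster_id": "c-2", "cluster_name": "nightly", "cluster_source": "JOB"},
        {
            "cluster_id": "c-3",
            "cluster_name": "pipeline-cluster",
            "cluster_source": "EXAMPLE_UNLISTED_SOURCE",
        },
    ]
}

if __name__ == "__main__":
    print(count_by_source(SAMPLE_RESPONSE))
    print([c["cluster_id"] for c in unlisted_clusters(SAMPLE_RESPONSE)])
```

In practice the dictionary would come from the JSON body of an authenticated GET to the endpoint; the helpers only inspect the parsed response, so they can be run against a saved payload.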

Latest Reply
Vidula
Honored Contributor
  • 6 kudos

Hi @Sam Steere, hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Thanks!

  • 6 kudos
2 More Replies
PrebenOlsen
by New Contributor III
  • 1696 Views
  • 4 replies
  • 1 kudos

GroupBy in delta live tables fails with error "RuntimeError: Query function must return either a Spark or Koalas DataFrame"

I have a delta live table that I'm trying to run GroupBy on, but I'm getting an error: "RuntimeError: Query function must return either a Spark or Koalas DataFrame". Here is my code: @dlt.table def groups_hierarchy():   df = dlt.read_stream("groups_h...

Latest Reply
Vidula
Honored Contributor
  • 1 kudos

Hi @Preben Olsen, does @Debayan Mukherjee's response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly? We'd love to hear from you. Thanks!

  • 1 kudos
3 More Replies
BenLambert
by Contributor
  • 1260 Views
  • 2 replies
  • 2 kudos

Resolved! Delta Live Tables not inferring table schema properly.

I have a delta live tables pipeline that is loading and transforming data. Currently I am having a problem that the schema inferred by DLT does not match the actual schema of the table. The table is generated via a groupby.pivot operation as follows:...

Latest Reply
BenLambert
Contributor
  • 2 kudos

I was able to get around this by specifying the table schema in the table decorator.
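A minimal sketch of what that workaround could look like, assuming the `schema` argument of `@dlt.table` is given a DDL string. The table names, column names, and types below are hypothetical and are not taken from the original pipeline. The decorator is applied inside a function that receives the `dlt` module, purely so the snippet can also be loaded outside a DLT pipeline, where `dlt` is not available.

```python
from typing import Any, Mapping


def to_ddl(columns: Mapping[str, str]) -> str:
    """Render a {column: type} mapping as a DDL schema string."""
    return ", ".join(f"{name} {dtype}" for name, dtype in columns.items())


# Hypothetical schema for the pivoted table: one key column plus one
# column per pivot value, declared explicitly instead of being inferred.
PIVOT_SCHEMA = to_ddl(
    {
        "device_id": "STRING",
        "metric_a": "DOUBLE",
        "metric_b": "DOUBLE",
    }
)


def register_pivot_table(dlt: Any) -> None:
    """Define the table; call as register_pivot_table(dlt) after `import dlt`."""

    @dlt.table(name="pivoted_metrics", schema=PIVOT_SCHEMA)
    def pivoted_metrics():
        # "metrics_long", "metric", and "value" are placeholder names.
        return (
            dlt.read("metrics_long")
            .groupBy("device_id")
            .pivot("metric", ["metric_a", "metric_b"])
            .sum("value")
        )
```

Inside a pipeline notebook the decorator would normally be written at module level; the declared columns have to match what the pivot actually produces, otherwise the update fails on a schema mismatch.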

  • 2 kudos
1 More Replies
osoucy
by New Contributor II
  • 589 Views
  • 0 replies
  • 1 kudos

Is it possible to join two aggregated streams of data?

Objective: Within the context of a delta live table, I'm trying to join two stream aggregations, but I run into challenges. Is it possible to achieve such a join? Context: Assume - table trades stores a list of trades with their associated time stamps - tabl...

vjraitila
by New Contributor III
  • 1193 Views
  • 3 replies
  • 5 kudos

Strategy for streaming ETL and Delta Lake before Delta Live Tables existed

What was the established architectural pattern for doing streaming ETL with Delta Lake before DLT was a thing? And incidentally, what approach would you take in the context of delta-oss today? The pipeline definitions would not have had to be declara...

Latest Reply
Vidula
Honored Contributor
  • 5 kudos

Hi @Veli-Jussi Raitila, does @Shanmugavel Chandrakasu's response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly? We'd love to hear from you. Thanks!

  • 5 kudos
2 More Replies
pmt
by New Contributor III
  • 2005 Views
  • 7 replies
  • 1 kudos

Handling Changing Schema in CDC DLT

We are building a DLT pipeline and the autoloader is handling schema evolution fine. However, further down the pipeline we are trying to load that streamed data with the apply_changes() function into a new table and, from the looks of it, doesn't see...

Latest Reply
Vidula
Honored Contributor
  • 1 kudos

Hey there @Palani Thangaraj, hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear fro...

  • 1 kudos
6 More Replies
PrebenOlsen
by New Contributor III
  • 1130 Views
  • 1 reply
  • 1 kudos

Resolved! Why does @dlt.table from a table give different results than from a view?

I have some data in silver that I read in as a view using the __apply_changes function on. I create a table based on this, and I then want to create my gold-table, after doing a .groupBy() and .pivot(). The transformations I do in the gold-table aren...

Latest Reply
PrebenOlsen
New Contributor III
  • 1 kudos

I have found a temporary workaround for this. The .pivot("columnName") should automatically grab all the values it can find, but for some reason it does not. I need to specify the values, using .pivot("group_name", "group0", "group1", "group2"...) ...
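A minimal sketch of the explicit-values form in PySpark, where `GroupedData.pivot` takes the pivot column followed by a list of values. The grouping column `user_id` and the `count()` aggregation are assumptions for illustration; only `group_name` and the three example values come from the reply, and the full list there is truncated.

```python
from typing import Any, Sequence

# The three values shown in the reply; the real list is longer and is
# truncated in the excerpt, so extend it for your own data.
GROUP_VALUES = ["group0", "group1", "group2"]


def pivot_groups(
    df: Any,
    key_col: str = "user_id",  # hypothetical grouping column
    values: Sequence[str] = GROUP_VALUES,
) -> Any:
    """Pivot `group_name` into one column per listed value.

    `df` is expected to be a Spark DataFrame. Passing the values
    explicitly fixes the output columns up front instead of relying on
    Spark to discover the distinct values of the pivot column.
    """
    return df.groupBy(key_col).pivot("group_name", list(values)).count()
```

Values that are listed but absent from the data still produce a column (filled with nulls), which keeps the output schema stable between runs.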

  • 1 kudos
Erik
by Valued Contributor II
  • 2350 Views
  • 1 reply
  • 3 kudos

Resolved! How to combine medallion architecture and delta live-tables nicely?

Like many of you, we have implemented a "medallion architecture" (raw/bronze/silver/gold layers), with each layer stored on a separate storage account. We only create proper Hive tables of the gold layer tables, so our Power BI users connecting to the da...

Latest Reply
merca
Valued Contributor II
  • 3 kudos

I can answer the first question: You can define data storage by setting the `path` parameter for tables. The "storage path" in pipeline settings will then only hold checkpoints (and some other pipeline stuff) and data will be stored in the correct acc...

  • 3 kudos
ilarsen
by Contributor
  • 575 Views
  • 0 replies
  • 1 kudos

Trouble referencing a column that has been added by schema evolution (Auto Loader with Delta Live Tables)

Hi, I have a Delta Live Tables pipeline, using Auto Loader, to ingest from JSON files. I need to do some transformations - in this case, converting timestamps. Except one of the timestamp columns does not exist in every file. This is causing the DLT p...

MadelynM
by New Contributor III
  • 5542 Views
  • 1 reply
  • 0 kudos

Delta Live Tables + S3 | 5 tips for cloud storage with DLT

You’ve gotten familiar with Delta Live Tables (DLT) via the quickstart and getting started guide. Now it’s time to tackle creating a DLT data pipeline for your cloud storage, with one line of code. Here’s how it’ll look when you're starting: CREATE OR ...

Latest Reply
MadelynM
New Contributor III
  • 0 kudos

Tip #3: Use JSON cluster configurations to access your storage location. Knowledge check: How do I modify DLT settings using JSON? Delta Live Tables settings are expressed as JSON and can be modified in the Delta Live Tables UI [AWS] [Azure] [GCP]. Examp...

  • 0 kudos
FD_MR
by New Contributor II
  • 797 Views
  • 0 replies
  • 1 kudos

Delta Live Tables executing repeatedly and returning empty DF

Still relatively new to Spark and even more so to Delta Live Tables, so apologies if I've missed something fundamental, but here goes. We are trying to run a notebook via Delta Live Tables, which contains 2 functions decorated by the `dlt.table` decorat...

karthikM
by New Contributor
  • 1102 Views
  • 3 replies
  • 1 kudos

Delta Live Tables

Is DLT supported for Scala? Any reference implementations or wikis to get started?

Latest Reply
Kaniz
Community Manager
  • 1 kudos

Hi @Karthik Munipalle, Delta Live Tables queries can be implemented in Python or SQL. Here are a few articles that explain DLT well. Please have a look. https://docs.databricks.com/data-engineering/delta-live-tables/index.html https://databricks.com/...

  • 1 kudos
2 More Replies