Hi there, I'm curious if anyone is able to definitively help me answer how DLT Job Clusters operate/run. For example, the following is my baseline understanding of DLT Job Clusters. If I run a Triggered DLT Pipeline (e.g. daily), the job cluster takes m...
Hey there! Thanks a bunch for being part of our awesome community! We love having you around and appreciate all your questions. Take a moment to check out the responses – you'll find some great info. Your input is valuable, so pick the best solution...
I'm trying to figure out the best way to "de-duplicate" data via DLT. Currently, my only leads are: Manage data quality with Delta Live Tables | Databricks on AWS, via "Drop invalid records"; Constraints on Databricks | Databricks on AWS, via "pre-de...
Hey @ChristianRRL, based on my understanding, you want to de-duplicate your data during your DLT pipeline processing. Unfortunately, I was not able to find a solution when I ran into this problem, due to the native feature limitations. Limitations...
Super basic question. For DLT pipelines I see there's an option to add multiple "Paths". Is it generally best practice to completely separate `bronze` from `silver` notebooks? Or is it more recommended to bundle both raw `bronze` and clean `silver` d...
Hi there, I'm wondering if someone can help me understand what compute resources DLT uses? It's not clear to me at all if it uses the last compute cluster I had been working on, or something else entirely. Can someone please help clarify this?
Well, one thing they emphasize in the 'Advanced Data Engineer' training is that job clusters will terminate within 5 minutes after a job is completed. So this could support your theory about lowering costs. I think job clusters are actually design...