Start your journey with Databricks by joining discussions on getting started guides, tutorials, and introductory topics. Connect with beginners and experts alike to kickstart your Databricks experience.
The title is self-explanatory, but I'm curious if anyone has any inclination as to when Materialized Views will enter General Availability? I'm surprised it's not already in GA considering (A) it's been a year or more since Databricks originally anno...
Hi there,I'm curious if anyone is able to definitively help me answer how DLT Job Clusters operate/run.For example, the following is my baseline understanding of DLT Job Clusters. If I run a Triggered DLT Pipeline (e.g. daily) the job cluster takes m...
Ideally one would expect clusters used for DLT pipeline to terminate after the pipeline execution has finished. However, while running in `development` environment, you'll notice it doesn't terminate on its own, whereas in `production` it terminates ...
I'm trying to figure out what's the best way to "de-duplicate" data via DLT. Currently, my only leads are:Manage data quality with Delta Live Tables | Databricks on AWSVia "Drop invalid records"Constraints on Databricks | Databricks on AWSVia "pre-de...
Hey @ChristianRRL ,Based on my understanding you want to de-duplicate your data during your DLT pipeline processing unfortunately I was not able to find a solution to this when I ran into this problem due to the native feature limitations.Limitations...
Super basic question. For DLT pipelines I see there's an option to add multiple "Paths". Is it generally best practice to completely separate `bronze` from `silver` notebooks? Or is it more recommended to bundle both raw `bronze` and clean `silver` d...
Hi there, I'm wondering if someone can help me understand what compute resources DLT uses? It's not clear to me at all if it uses the last compute cluster I had been working on, or something else entirely.Can someone please help clarify this?
Well, one thing they emphasize in the 'Adavanced Data Engineer' Training is that job-clusters will terminate within 5 minutes after a job is completed. So this could be in support of your theory to lower costs. I think job-cluster are actually design...