Hi everyone, I am a product manager in Amsterdam. I work on Jobs and Databricks SQL. In my spare time, I cook and send emojis. Looking forward to working with this community!
I'll try to answer this in the simplest possible way:
1. Spark is an imperative programming framework. You tell it what to do, and it does it. DLT is declarative - you describe what you want the datasets to be (i.e. the transforms), and it takes care of the rest...
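A minimal sketch of that contrast, comparing plain PySpark with the DLT Python API (the table names and path are made-up placeholders, and `spark` is the session Databricks notebooks provide implicitly):

```python
# Imperative Spark: you spell out each step and trigger it yourself.
raw = spark.read.format("json").load("/mnt/raw/orders")           # placeholder path
cleaned = raw.filter("amount IS NOT NULL")
cleaned.write.format("delta").mode("overwrite").saveAsTable("orders_clean")

# Declarative DLT: you only describe what the dataset should be;
# the pipeline works out ordering, retries, and infrastructure for you.
import dlt

@dlt.table(comment="Orders with null amounts removed")
def orders_clean():
    return spark.read.table("orders_raw").filter("amount IS NOT NULL")
```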
@DBEnthusiast great question! Today, with Job Clusters, you have to specify this. As @btafur noted, you do this by setting CPU, memory, etc. We are in early preview of Serverless Job Clusters, where you no longer specify this configuration; instead, Databricks picks and manages the compute for you.
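For reference, this is roughly the kind of cluster spec you attach to a job task today (a sketch with illustrative values, not a sizing recommendation); with Serverless Job Clusters this block goes away:

```python
# Sketch of the "new_cluster" block you provide to the Jobs API today (values illustrative).
job_cluster_spec = {
    "spark_version": "13.3.x-scala2.12",  # Databricks Runtime version
    "node_type_id": "i3.xlarge",          # instance type -> CPU / memory per worker
    "autoscale": {
        "min_workers": 2,
        "max_workers": 8,
    },
}
# With Serverless Job Clusters you omit this entirely and Databricks sizes the compute.
```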
@ankris can you describe your data pipeline a bit? If you are writing in Delta Live Tables (my recommendation) then you can express data quality checks "in flight" as the pipeline processes data. You can do post-ETL data quality checks (e.g. at the a...
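For the "in flight" checks, DLT expectations look roughly like this (a small sketch; the table and rule names are just examples):

```python
import dlt

@dlt.table(comment="Orders that pass basic quality rules")
@dlt.expect("valid_amount", "amount > 0")                         # log violations, keep the rows
@dlt.expect_or_drop("known_customer", "customer_id IS NOT NULL")  # drop rows that fail the rule
def orders_validated():
    # dlt.read() pulls from another dataset defined in the same pipeline
    return dlt.read("orders_raw")
```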