Could Jobs do everything Delta Live Tables do?

xiangzhu
Contributor

Hello,

I've read the posts:

Jobs - Delta Live tables difference (databricks.com)

and

Difference between Delta Live Tables and Multitask Jobs (databricks.com)

My understanding is that Delta Live Tables are more like a DSL that simplifies the workflow definition (JSON instead of code).

Could you please confirm that Jobs can do everything that Delta Live Tables do, but not vice versa?

3 REPLIES

LandanG
Honored Contributor

Hi @Xiang ZHU​ ,

DLT is a declarative way (either SQL or Python) to build data pipelines in Databricks that uses Delta tables for each stage in the pipeline and has many features and benefits that running ETL pipelines in a notebook might not have. Jobs are a way to orchestrate tasks in Databricks that may include DLT pipelines and much more.
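To make "declarative" concrete, here is a minimal sketch of a DLT pipeline notebook in Python. The table names and the source path are made up for illustration, and the code assumes it runs inside a DLT pipeline, where spark and the dlt module are available:

```python
import dlt
from pyspark.sql.functions import col

@dlt.table(comment="Raw events ingested incrementally from cloud storage")
def raw_events():
    # Auto Loader picks up new JSON files as they arrive
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("/mnt/raw/events")  # illustrative path
    )

@dlt.table(comment="Events with a valid event_type")
def clean_events():
    # dlt.read_stream declares the dependency on raw_events;
    # DLT works out the execution order and the infrastructure for you
    return dlt.read_stream("raw_events").where(col("event_type").isNotNull())
```

You only state what each table should contain; DLT handles the ordering, clusters, and monitoring around it.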

So while you can use Jobs to schedule a DLT pipeline, they don't replace each other: Jobs won't be able to do what DLT does, and DLT won't be able to do what Jobs does.

Jobs docs: https://docs.databricks.com/workflows/jobs/jobs.html

DLT docs: https://docs.databricks.com/workflows/delta-live-tables/index.html

xiangzhu
Contributor

@Landan George​ 

"Jobs won't be able to do what DLT does",

I've read some blogs and watched some videos too, but I still cannot figure out the difference between Jobs and DLT. Does it mean that without DLT, Databricks Jobs cannot handle Delta tables?

Could you please spotlight concretely what DLT can do that Jobs can't? Just a few examples would be enough.

LandanG
Honored Contributor

@Xiang ZHU​ 

From the docs above:

Delta Live Tables is a framework for building reliable, maintainable, and testable data processing pipelines. You define the transformations to perform on your data, and Delta Live Tables manages task orchestration, cluster management, monitoring, data quality, and error handling.

Instead of defining your data pipelines using a series of separate Apache Spark tasks, Delta Live Tables manages how your data is transformed based on a target schema you define for each processing step. You can also enforce data quality with Delta Live Tables expectations. Expectations allow you to define expected data quality and specify how to handle records that fail those expectations.
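As a rough example of an expectation (this builds on the made-up clean_events table from the sketch in my earlier reply; the rule name and predicate are illustrative):

```python
import dlt

@dlt.table(comment="Events that passed the data quality rule")
@dlt.expect_or_drop("valid_timestamp", "event_ts IS NOT NULL")
def validated_events():
    # Rows failing the expectation are dropped and counted in the
    # pipeline's data quality metrics instead of silently flowing through.
    return dlt.read("clean_events")
```

This kind of built-in quality tracking and lineage is what a plain notebook run by a job doesn't give you out of the box.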

A job is a way to run non-interactive code in a Databricks cluster. For example, you can run an extract, transform, and load (ETL) workload interactively or on a schedule. You can also run jobs interactively in the notebook UI.

Your job can consist of a single task or can be a large, multi-task workflow with complex dependencies. Databricks manages the task orchestration, cluster management, monitoring, and error reporting for all of your jobs. You can run your jobs immediately or periodically through an easy-to-use scheduling system.
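For contrast, here is a sketch of creating a multi-task job through the Jobs 2.1 REST API, where a notebook task runs first and a dependent task triggers a DLT pipeline. The notebook path, cluster ID, pipeline ID, and schedule are placeholders, and the snippet assumes a workspace URL and personal access token in environment variables:

```python
import os
import requests

job_spec = {
    "name": "nightly-etl",
    "tasks": [
        {
            "task_key": "ingest",
            "notebook_task": {"notebook_path": "/Repos/etl/ingest"},
            "existing_cluster_id": "1234-567890-abcde123",
        },
        {
            "task_key": "dlt_pipeline",
            "depends_on": [{"task_key": "ingest"}],
            # A job task can trigger a DLT pipeline, which is how the two
            # features compose rather than replace each other.
            "pipeline_task": {"pipeline_id": "<your-dlt-pipeline-id>"},
        },
    ],
    "schedule": {
        "quartz_cron_expression": "0 0 2 * * ?",
        "timezone_id": "UTC",
    },
}

resp = requests.post(
    f"{os.environ['DATABRICKS_HOST']}/api/2.1/jobs/create",
    headers={"Authorization": f"Bearer {os.environ['DATABRICKS_TOKEN']}"},
    json=job_spec,
)
print(resp.json())  # returns the new job_id on success
```

The job handles scheduling and sequencing of tasks; the DLT pipeline it triggers owns the table-level orchestration, data quality, and error handling.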
