Administration & Architecture

Delta Live Table pipeline steps explanation

ac0
Contributor

Does anyone have documentation on what is actually occurring in each of these steps?

Creating update
Waiting for resources
Initializing
Setting up tables
Rendering graph

For example, what is the difference between "Initializing" and "Setting up tables"? I am trying to find out exactly what happens in each of these steps.
3 REPLIES

Mounika_Tarigop
Databricks Employee

Here is what happens in each step of a Delta Live Tables (DLT) pipeline update:

  1. Creating update: This step prepares the update to the pipeline. It includes determining the changes that need to be applied to the data tables based on the defined transformations and data flow.

  2. Waiting for resources: During this step, the pipeline waits for the required compute resources to be allocated, so that enough resources are available before execution starts.

  3. Initializing: Initialization involves setting up the environment for the pipeline execution. This includes loading configurations, preparing the execution context, and ensuring that all dependencies are in place.

  4. Setting up tables: This step involves creating or refreshing the tables defined in the pipeline. It includes setting up the schema, applying any necessary constraints, and preparing the tables for data ingestion and transformation.

  5. Rendering graph: In this step, the pipeline generates a visual representation of the data flow and transformations. This graph helps in understanding the dependencies and the sequence of operations within the pipeline.

The difference between "initializing" and "setting up tables" is that initializing focuses on preparing the overall execution environment, while setting up tables specifically deals with creating and configuring the tables that will be used in the pipeline.

https://www.databricks.com/discover/pages/getting-started-with-delta-live-tables
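As a rough aid for reading the list above, the steps can be written down as an ordered table. This is only a sketch: the pairing of each UI label with an update state name (`CREATED`, `WAITING_FOR_RESOURCES`, `INITIALIZING`, `SETTING_UP_TABLES`) is an assumption based on the state names exposed by the Pipelines API, "Rendering graph" has no assumed counterpart here, and the helper names `step_index` and `runs_before` are hypothetical.

```python
# Ordered pipeline steps as described in the reply above.
# Each entry: (UI label, assumed API update state or None, short summary).
# The label-to-state pairing is an assumption, not confirmed by the thread.
PIPELINE_STEPS = [
    ("Creating update", "CREATED", "prepare the update to the pipeline"),
    ("Waiting for resources", "WAITING_FOR_RESOURCES", "wait for compute to be allocated"),
    ("Initializing", "INITIALIZING", "prepare the overall execution environment"),
    ("Setting up tables", "SETTING_UP_TABLES", "create and configure the pipeline's tables"),
    ("Rendering graph", None, "build the visual data flow graph"),
]


def step_index(label: str) -> int:
    """Return the position of a step in the sequence (case-insensitive match)."""
    for i, (ui_label, _state, _summary) in enumerate(PIPELINE_STEPS):
        if ui_label.lower() == label.lower():
            return i
    raise ValueError(f"unknown pipeline step: {label!r}")


def runs_before(first: str, second: str) -> bool:
    """True if `first` appears earlier in the sequence than `second`."""
    return step_index(first) < step_index(second)


if __name__ == "__main__":
    for ui_label, state, summary in PIPELINE_STEPS:
        print(f"{ui_label:<22} {state or '-':<22} {summary}")
```

For instance, `runs_before("Initializing", "Setting up tables")` is true, which matches the distinction drawn in the reply: the environment is prepared first, then the tables are configured.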

kapil-dua
New Contributor II

@Mounika_Tarigop Please explain:
Are loading data (refresh or full refresh) into all streaming tables and refreshing materialized views part of the "Setting up tables" step when a DLT pipeline runs in triggered mode?

Mounika_Tarigop
Databricks Employee

Yes, loading data (refresh or full refresh) into all streaming tables and refreshing materialized views are part of the "Setting up tables" step in a Delta Live Tables (DLT) pipeline when running in triggered mode.

In triggered mode, materialized views are fully recomputed every time the pipeline is executed. For streaming tables, a full refresh truncates the table and reprocesses all data available in the source using the latest definition of the streaming table. This ensures that the tables and views reflect the current state of their input data sources.
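To make the refresh versus full refresh distinction concrete, here is a minimal sketch of how a triggered update might be requested through the Pipelines REST API. It only builds the request and does not send it. The workspace URL and pipeline ID are placeholders, the helper name `build_start_update_request` is hypothetical, and the sketch assumes the `POST /api/2.0/pipelines/{pipeline_id}/updates` endpoint with a `full_refresh` flag in the JSON body.

```python
import json


def build_start_update_request(workspace_url: str, pipeline_id: str,
                               full_refresh: bool = False) -> tuple[str, str]:
    """Build the URL and JSON body for starting a pipeline update.

    full_refresh=False asks for a normal refresh; full_refresh=True asks for
    all tables to be reset and recomputed from their sources.
    """
    url = f"{workspace_url.rstrip('/')}/api/2.0/pipelines/{pipeline_id}/updates"
    body = json.dumps({"full_refresh": full_refresh})
    return url, body


if __name__ == "__main__":
    # Placeholder values; replace with a real workspace URL and pipeline ID.
    url, body = build_start_update_request(
        "https://example.cloud.databricks.com", "my-pipeline-id", full_refresh=True
    )
    print("POST", url)
    print(body)
```

Sending the request would additionally need an `Authorization: Bearer <token>` header, which is left out here so the sketch stays runnable without credentials.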
