Is it possible to create/update non dlt table in init phase of dlt task?

PassionateDBD
New Contributor II

We have a dlt task that is written in Python. Is it possible to create or update a delta table programmatically from inside a dlt task? The delta table would not be managed by the dlt task, because we never want to fully refresh that table. The table is more of a "logging table" to which we only append configurations when they change over time. It would be fine if we could update the table once each time we start the dlt task, for example.

2 REPLIES

Kaniz
Community Manager

Hi @PassionateDBD, you can programmatically create or update a Delta table from within a Delta Live Tables (DLT) task using Python. DLT provides several options for managing tables and views, including starting pipeline updates, validating updates, and scheduling pipelines as jobs.

Let’s explore how you can achieve this:

  1. Starting a Pipeline Update:

    • When you’re ready to run a pipeline, you can start an update. A pipeline update does the following:
      • Starts a cluster with the correct configuration.
      • Discovers all the tables and views defined in your pipeline.
      • Checks for analysis errors (such as invalid column names, missing dependencies, and syntax errors).
      • Creates or updates tables and views with the most recent data available.
    • You can start a pipeline update via:
  2. Update Types:

    • Depending on your requirements, you can choose different update types:
      • Refresh All: Updates all live tables to reflect the current state of their input data sources. For streaming tables, new rows are appended.
      • Full Refresh All: Updates all live tables by clearing existing data and loading all data from the streaming source.
      • Refresh Selection: Similar to refresh all, but allows you to refresh only selected tables.
      • Full Refresh Selection: Similar to full refresh all, but for selected tables.
    • You can tailor the update behavior based on your use case.
  3. Dynamic Table Generation:

    • If you want to dynamically generate tables in Databricks using Python and DLT, you can leverage the Python APIs.
    • For example, you can conditionally create a DLT table inside an if block:
      import dlt
      
      def create_logging_table(**kwargs):
          # Optional DataFrame holding the configuration rows to publish.
          df = kwargs.get("df_tableoperation")
          if df is not None:
              # The table is only registered with the pipeline when a
              # DataFrame was actually supplied.
              @dlt.table(name="config_changes", comment="Logging table for configuration changes")
              def config_changes_table():
                  return df
      
    • This allows you to create or update the “logging table” when specific conditions are met.
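The update types in item 2 can also be requested programmatically. The following is a minimal sketch, assuming the Databricks Pipelines REST endpoint `POST /api/2.0/pipelines/{pipeline_id}/updates` and its `full_refresh`, `refresh_selection` and `full_refresh_selection` request fields; the helper names `build_update_payload` and `start_update`, and the host, token and pipeline ID values passed to them, are hypothetical placeholders.

```python
import json
import urllib.request


def build_update_payload(full_refresh=False, refresh_selection=None,
                         full_refresh_selection=None):
    """Build the request body for one of the four update types.

    Refresh All            -> no arguments
    Full Refresh All       -> full_refresh=True
    Refresh Selection      -> refresh_selection=[...]
    Full Refresh Selection -> full_refresh_selection=[...]
    """
    if full_refresh and (refresh_selection or full_refresh_selection):
        raise ValueError("full_refresh cannot be combined with a table selection")
    payload = {"full_refresh": bool(full_refresh)}
    if refresh_selection:
        payload["refresh_selection"] = list(refresh_selection)
    if full_refresh_selection:
        payload["full_refresh_selection"] = list(full_refresh_selection)
    return payload


def start_update(host, token, pipeline_id, payload):
    """Send the update request (assumed endpoint; needs a valid workspace and token)."""
    request = urllib.request.Request(
        f"{host.rstrip('/')}/api/2.0/pipelines/{pipeline_id}/updates",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With a selection, a full refresh can be limited to the tables that are safe to rebuild, for example `build_update_payload(full_refresh_selection=["some_table"])`, where `some_table` is a placeholder name.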

Remember that DLT provides flexibility, and you can tailor your approach based on your specific use case. Whether it’s starting pipeline updates, dynamically generating tables, or managing your logging table, DLT’s Python APIs and features have got you covered! 🚀🐍

 

PassionateDBD
New Contributor II

Thanks for your reply @Kaniz ! I'm aware that a table can be created or skipped based on some parameter.

What I'm trying to figure out is basically how to achieve the following:

-DLT pipeline starts and logs some information to a delta table.

-On each DLT pipeline restart a new row should be appended to the same delta table.

-If I run a full refresh of the DLT pipeline, the delta table that received a row on each startup should remain untouched, so that "the log" of everything saved during earlier DLT pipeline startups is preserved.

The delta table may be a DLT table or a non DLT delta table but the functionality should be the one that I described.
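To make the three requirements concrete, here is a minimal sketch of the append step using the standard Spark DataFrame writer in append mode. It is not a confirmed DLT feature: whether such a write is permitted while a DLT pipeline initializes is an assumption that would need checking, and running it in an ordinary notebook task placed before the pipeline task in the same job is another option. The names `build_log_row` and `append_log_row`, the column names, and the table name in the usage line are all hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone


def build_log_row(config, pipeline_name, now=None):
    """Turn a configuration dict into one log row (plain Python, no Spark needed)."""
    # sort_keys makes the JSON, and therefore the hash, independent of key order.
    config_json = json.dumps(config, sort_keys=True)
    return {
        "logged_at": (now or datetime.now(timezone.utc)).isoformat(),
        "pipeline_name": pipeline_name,
        "config_hash": hashlib.sha256(config_json.encode("utf-8")).hexdigest(),
        "config_json": config_json,
    }


def append_log_row(spark, table_name, row):
    """Append one row to a Delta table that the pipeline does not manage.

    `spark` is an existing SparkSession. Because the table is not declared
    with @dlt.table, it is not part of the pipeline's set of tables.
    """
    (
        spark.createDataFrame([row])
        .write.format("delta")
        .mode("append")
        .saveAsTable(table_name)
    )
```

A call such as `append_log_row(spark, "ops.pipeline_start_log", build_log_row(config, "my_pipeline"))` would add one row per start. The `config_hash` column is there so that a later query can tell which starts actually changed the configuration.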
