12-11-2024 11:09 PM
How can we process multiple tables in parallel within a Delta Live Tables pipeline, passing the table names as parameters?
12-12-2024 04:54 AM
To process multiple tables within a Delta Live Tables (DLT) pipeline in parallel using table names as parameters, you can leverage the flexibility of the DLT Python API. Here's an example of how you can define multiple tables dynamically:
import dlt

# Function that registers a DLT table for a given source table name
def create_table(table_name):
    @dlt.table(name=table_name)
    def table_def():
        return spark.read.table(f"source_database.{table_name}")

# List of table names to process
table_names = ["table1", "table2", "table3"]

# Register the tables dynamically
for table_name in table_names:
    create_table(table_name)
12-15-2024 08:49 PM
If we use a for loop to pass the table names, they will be handled one by one, right?
If so, can you suggest another method? I need to process 'n' tables at a time.
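A minimal sketch for context, assuming a hypothetical pipeline configuration key mypipeline.table_names set in the DLT pipeline settings: the for loop only registers the table definitions; DLT then builds a dependency graph and refreshes independent tables in parallel during the update, so 'n' tables are processed concurrently without extra looping logic.

import dlt

# Pipeline configuration values are readable via spark.conf inside the
# pipeline's source notebook; "mypipeline.table_names" is an assumed key,
# e.g. set to "table1,table2,table3" in the pipeline settings.
table_names = spark.conf.get("mypipeline.table_names", "table1,table2,table3").split(",")

def create_table(table_name):
    @dlt.table(name=table_name)
    def table_def():
        # Assumed source schema; replace with your own source.
        return spark.read.table(f"source_database.{table_name}")

# This loop only *registers* n table definitions; DLT schedules the
# independent tables concurrently when the pipeline update runs.
for table_name in table_names:
    create_table(table_name)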
12-17-2024 11:28 PM
Can we run a DLT pipeline multiple times at the same time with different parameters, using REST API calls with asyncio?
I have created a function that starts the pipeline through the REST API.
When I call that function with asyncio, I get a 409 Conflict error.
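For context: a DLT pipeline allows only one active update at a time, so a second start-update call against the same pipeline_id while an update is still running returns 409 Conflict. As far as I know, truly concurrent runs with different parameters need separate pipelines, since parameters live in the pipeline configuration rather than in the update request. A rough sketch that serializes the runs instead, assuming DATABRICKS_HOST and TOKEN are defined elsewhere:

import time
import requests

HEADERS = {"Authorization": f"Bearer {TOKEN}"}

def run_update_and_wait(pipeline_id, full_refresh=False, poll_seconds=30):
    # Start an update (Pipelines API: POST /api/2.0/pipelines/{id}/updates).
    resp = requests.post(
        f"{DATABRICKS_HOST}/api/2.0/pipelines/{pipeline_id}/updates",
        headers=HEADERS,
        json={"full_refresh": full_refresh},
    )
    resp.raise_for_status()
    update_id = resp.json()["update_id"]

    # Poll the update until it reaches a terminal state, then return it.
    while True:
        status = requests.get(
            f"{DATABRICKS_HOST}/api/2.0/pipelines/{pipeline_id}/updates/{update_id}",
            headers=HEADERS,
        ).json()
        state = status["update"]["state"]
        if state in ("COMPLETED", "FAILED", "CANCELED"):
            return state
        time.sleep(poll_seconds)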
02-25-2025 09:33 AM
@Alberto_Umana where you're ingesting the list table_names = ["table1", "table2", "table3"], can I replace this with the row values from a DLT view?
When I tried using a @dlt.view, I ran into an error saying that I need to iterate within the confines of a DLT structure, and if I use the rows from a @dlt.table I run into a "table not found" error, which I think is a limitation of how DLT sets up the DAG/relationships before the actual processing?
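One hedged sketch of what may work here: read the driving list with plain spark.read at the top level of the notebook (definition time) rather than from a @dlt.view or @dlt.table, because the list must be known while DLT constructs the DAG, whereas dataset function bodies only execute later during the update. config_db.table_list and its table_name column are assumed names:

import dlt

# Assumed config table holding one row per table to ingest; read at
# definition time so the names are available while the DAG is built.
table_names = [
    row["table_name"]
    for row in spark.read.table("config_db.table_list").select("table_name").collect()
]

def create_table(table_name):
    @dlt.table(name=table_name)
    def table_def():
        return spark.read.table(f"source_database.{table_name}")

for table_name in table_names:
    create_table(table_name)

Note that the config table has to exist outside the pipeline (it cannot itself be a dataset defined in the same pipeline), since it is read before the update starts.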