How to do a Full Load using DLT pipeline

ImranA — Wed, 12 Feb 2025 20:38:54 GMT

If I use "spark.readStream", it does incremental loads, and if I use "spark.read", it creates a materialised view.

What I want is to do a full load each time (no need for SCD types), and the result should be a streaming table, not a materialised view.

Any help would be appreciated.

Re: How to do a Full Load using DLT pipeline

AmanSehgal — Thu, 13 Feb 2025 03:06:09 GMT

In Databricks Delta Live Tables (DLT), you can't directly truncate a streaming table, as streaming tables are append-only by design.

However, in your scenario, you could possibly use a job workflow in which the first task runs a SQL statement (using serverless SQL) to truncate the table and the following task runs your DLT pipeline.
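
As a rough sketch of that two-task workflow, the Python below only builds a job definition in the shape the Databricks Jobs API 2.1 expects (a `sql_task` followed by a `pipeline_task` linked with `depends_on`). It does not call Databricks. The function name, task keys, and all IDs are placeholders I made up for illustration, and it assumes the truncate statement has already been saved as a SQL query on a serverless SQL warehouse. The `full_refresh` flag is a field of `pipeline_task` that I added as an option; it is not part of the suggestion above, and whether you need it depends on your source.

```python
import json


def build_full_load_job(job_name, warehouse_id, truncate_query_id,
                        pipeline_id, full_refresh=False):
    """Return a Jobs API 2.1 style payload: truncate first, then run the pipeline.

    All arguments are placeholders; substitute the IDs from your workspace.
    """
    return {
        "name": job_name,
        "tasks": [
            {
                # Task 1: run the saved SQL query that truncates the target table.
                "task_key": "truncate_target",
                "sql_task": {
                    "warehouse_id": warehouse_id,
                    "query": {"query_id": truncate_query_id},
                },
            },
            {
                # Task 2: run the DLT pipeline only after task 1 has succeeded.
                "task_key": "run_dlt_pipeline",
                "depends_on": [{"task_key": "truncate_target"}],
                "pipeline_task": {
                    "pipeline_id": pipeline_id,
                    # Optional addition, not from the original answer.
                    "full_refresh": full_refresh,
                },
            },
        ],
    }


if __name__ == "__main__":
    # Hypothetical IDs, shown only to illustrate the payload shape.
    payload = build_full_load_job(
        job_name="full-load-dlt",
        warehouse_id="<warehouse-id>",
        truncate_query_id="<saved-query-id>",
        pipeline_id="<pipeline-id>",
    )
    print(json.dumps(payload, indent=2))
```

The printed JSON could then be submitted when creating the job (through the Jobs API, the CLI, or the SDK), or the same two tasks can be set up by hand in the Workflows UI.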