<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Adding if statements or try/catch block in sql based dlt pipelines in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/adding-if-statements-or-try-catch-block-in-sql-based-dlt/m-p/107596#M42853</link>
    <description>&lt;P&gt;Is there a reason why you can't use Autoloader for this?&amp;nbsp; That would only trigger the pipeline when new files arrive.&lt;/P&gt;</description>
    <pubDate>Wed, 29 Jan 2025 13:56:51 GMT</pubDate>
    <dc:creator>Rjdudley</dc:creator>
    <dc:date>2025-01-29T13:56:51Z</dc:date>
    <item>
      <title>Adding if statements or try/catch block in sql based dlt pipelines</title>
      <link>https://community.databricks.com/t5/data-engineering/adding-if-statements-or-try-catch-block-in-sql-based-dlt/m-p/107563#M42836</link>
      <description>&lt;P&gt;We have fully SQL-based DLT pipelines in which the bronze tables are read from UC volumes. There can be situations where no new data arrives for some of the endpoints in the UC volume. In that case the corresponding SQL code blocks fail, which fails the entire pipeline. Ideally, the SQL blocks for endpoints that received new data should run correctly, while the blocks for endpoints with no new files should be skipped. There should be some if/else or try/catch construct to handle this scenario. How can this be achieved?&lt;/P&gt;&lt;P&gt;This is how, for example, two code blocks of my SQL-based DLT pipeline currently look.&lt;/P&gt;&lt;P&gt;Code block 1:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;CREATE OR REFRESH MATERIALIZED VIEW jamf_inventory_information
COMMENT "Incremental upload of data incoming from source endpoint 1"
AS
SELECT *
FROM read_files(
  '/Volumes/catalog/${schema_path}/catalog_name-${schema_path}/source=source/*/table_name=endpoint/*/*.json'
);&lt;/LI-CODE&gt;&lt;P&gt;Code block 2:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;CREATE OR REFRESH MATERIALIZED VIEW jamf_mobile_devices
COMMENT "Incremental upload of data incoming from Jamf for endpoint 2"
AS
SELECT *
FROM read_files(
  '/Volumes/catalog/${schema_path}/catalog_name-${schema_path}/source=source/*/table_name=endpoint_2/*/*.json'
);&lt;/LI-CODE&gt;&lt;P&gt;Code block 1, for endpoint_1, has files (or received a new one), so it runs correctly.&lt;/P&gt;&lt;P&gt;Code block 2, for endpoint_2, has no files, or no folder at all. Ideally that code block should be skipped; if it runs, it fails and raises an error.&lt;/P&gt;</description>
      <pubDate>Wed, 29 Jan 2025 11:27:17 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/adding-if-statements-or-try-catch-block-in-sql-based-dlt/m-p/107563#M42836</guid>
      <dc:creator>ashraf1395</dc:creator>
      <dc:date>2025-01-29T11:27:17Z</dc:date>
    </item>
    <item>
      <title>Re: Adding if statements or try/catch block in sql based dlt pipelines</title>
      <link>https://community.databricks.com/t5/data-engineering/adding-if-statements-or-try-catch-block-in-sql-based-dlt/m-p/107586#M42847</link>
      <description>&lt;P&gt;Hello, thank you for your question.&lt;/P&gt;
&lt;P&gt;Since SQL-based &lt;STRONG&gt;Delta Live Tables (DLT)&lt;/STRONG&gt; pipelines do not natively support &lt;STRONG&gt;IF-ELSE&lt;/STRONG&gt; or &lt;STRONG&gt;TRY-CATCH&lt;/STRONG&gt; constructs, you'll need another way to &lt;STRONG&gt;handle missing files gracefully&lt;/STRONG&gt;. Here are two recommended solutions:&lt;/P&gt;
&lt;H3&gt;&lt;STRONG&gt;Solution 1: Use &lt;CODE&gt;read_files&lt;/CODE&gt; with &lt;CODE&gt;ignoreMissingFiles =&amp;gt; true&lt;/CODE&gt;&lt;/STRONG&gt;&lt;/H3&gt;
&lt;P&gt;The &lt;CODE&gt;read_files&lt;/CODE&gt; function supports an &lt;CODE&gt;ignoreMissingFiles&lt;/CODE&gt; option that keeps the query from failing when files under the path are missing. Modify your queries as follows:&lt;/P&gt;
&lt;LI-CODE lang="markup"&gt;CREATE OR REFRESH MATERIALIZED VIEW jamf_mobile_devices
COMMENT "Incremental upload of data incoming from endpoint_2"
AS
SELECT *
FROM read_files(
  '/Volumes/catalog/${schema_path}/catalog_name-${schema_path}/source=source/*/table_name=endpoint_2/*/*.json',
  ignoreMissingFiles =&amp;gt; true  -- Don't fail when expected files are missing
);
&lt;/LI-CODE&gt;
&lt;P&gt;This way, missing files do not fail the query; if nothing is present, it returns an &lt;STRONG&gt;empty dataset&lt;/STRONG&gt;.&lt;/P&gt;
&lt;H3&gt;&lt;STRONG&gt;Solution 2: Use &lt;CODE&gt;WHERE&lt;/CODE&gt; with a Metadata Table (Advanced)&lt;/STRONG&gt;&lt;/H3&gt;
&lt;UL&gt;
&lt;LI&gt;Maintain a metadata table tracking available file paths.&lt;/LI&gt;
&lt;LI&gt;Before querying, &lt;STRONG&gt;filter out missing sources&lt;/STRONG&gt; dynamically.&lt;/LI&gt;
&lt;/UL&gt;
&lt;LI-CODE lang="markup"&gt;WITH available_sources AS (
  SELECT path FROM file_metadata_table WHERE table_name = 'endpoint_2'
)
CREATE OR REFRESH MATERIALIZED VIEW jamf_mobile_devices AS
SELECT * FROM read_files(
  (SELECT path FROM available_sources)
);
&lt;/LI-CODE&gt;
&lt;P&gt;This prevents querying non-existent sources.&lt;/P&gt;
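&lt;P&gt;For completeness, the metadata table itself could be maintained with something like the following sketch. The table name &lt;CODE&gt;file_metadata_table&lt;/CODE&gt; and its columns are assumptions from the example above, and the registration step would run outside the pipeline whenever a new endpoint folder lands:&lt;/P&gt;
&lt;LI-CODE lang="markup"&gt;-- One-time setup: a small tracking table for source paths (hypothetical schema)
CREATE TABLE IF NOT EXISTS file_metadata_table (
  table_name STRING,
  path STRING
);

-- Register an endpoint only once its folder exists and contains files
INSERT INTO file_metadata_table VALUES (
  'endpoint_2',
  '/Volumes/catalog/${schema_path}/catalog_name-${schema_path}/source=source/*/table_name=endpoint_2/*/*.json'
);
&lt;/LI-CODE&gt;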
&lt;P&gt;If the &lt;CODE&gt;read_files&lt;/CODE&gt; option works in your case, it is the easiest solution. Otherwise, dynamically maintaining a metadata table of available file paths is a more robust alternative. Let me know if you need more details!&lt;/P&gt;</description>
      <pubDate>Wed, 29 Jan 2025 13:31:05 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/adding-if-statements-or-try-catch-block-in-sql-based-dlt/m-p/107586#M42847</guid>
      <dc:creator>VZLA</dc:creator>
      <dc:date>2025-01-29T13:31:05Z</dc:date>
    </item>
    <item>
      <title>Re: Adding if statements or try/catch block in sql based dlt pipelines</title>
      <link>https://community.databricks.com/t5/data-engineering/adding-if-statements-or-try-catch-block-in-sql-based-dlt/m-p/107596#M42853</link>
      <description>&lt;P&gt;Is there a reason why you can't use Autoloader for this?&amp;nbsp; That would only trigger the pipeline when new files arrive.&lt;/P&gt;</description>
      <pubDate>Wed, 29 Jan 2025 13:56:51 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/adding-if-statements-or-try-catch-block-in-sql-based-dlt/m-p/107596#M42853</guid>
      <dc:creator>Rjdudley</dc:creator>
      <dc:date>2025-01-29T13:56:51Z</dc:date>
    </item>
    <item>
      <title>Re: Adding if statements or try/catch block in sql based dlt pipelines</title>
      <link>https://community.databricks.com/t5/data-engineering/adding-if-statements-or-try-catch-block-in-sql-based-dlt/m-p/107712#M42896</link>
      <description>&lt;P&gt;Yes, using Auto Loader with file notification mode can be useful here.&lt;/P&gt;
&lt;P&gt;You can also use the &lt;CODE&gt;IF EXISTS&lt;/CODE&gt; clause to check whether the files exist before attempting to create or refresh the materialized view. This prevents the SQL block from running when there are no new files.&lt;/P&gt;
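&lt;P&gt;For reference, the Auto Loader route in a SQL DLT pipeline uses a streaming table with &lt;CODE&gt;STREAM read_files&lt;/CODE&gt;, which only picks up files that have arrived since the last update rather than re-reading (and failing on) empty sources. A sketch, reusing the volume path pattern from the original question:&lt;/P&gt;
&lt;LI-CODE lang="markup"&gt;CREATE OR REFRESH STREAMING TABLE jamf_mobile_devices
COMMENT "Incrementally ingests new files for endpoint_2 as they arrive"
AS
SELECT *
FROM STREAM read_files(
  '/Volumes/catalog/${schema_path}/catalog_name-${schema_path}/source=source/*/table_name=endpoint_2/*/*.json',
  format =&amp;gt; 'json'
);
&lt;/LI-CODE&gt;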
</description>
      <pubDate>Thu, 30 Jan 2025 07:41:15 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/adding-if-statements-or-try-catch-block-in-sql-based-dlt/m-p/107712#M42896</guid>
      <dc:creator>Sidhant07</dc:creator>
      <dc:date>2025-01-30T07:41:15Z</dc:date>
    </item>
  </channel>
</rss>

