Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

How can I verify that the result of a DLT pipeline will have enough rows before updating the table?

yuinagam
New Contributor II

I have a DLT/Lakeflow pipeline that creates a table, and I need to make sure that it only updates the resulting materialized view if the result has more than one million records.

I've found this, but it seems to work only after the table I want to validate has already been updated, with the validation running as a separate job afterwards. That wouldn't work for me, because I need to ensure that at no point the table has too few rows. When I tried it within a single pipeline (creating a temporary version of the table, verifying that temporary table, and only creating the final table if the check passed), I ran into a problem where `dlt.read("table_name").count()` always equals zero, even though once the table is created I can count its rows and get more.
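
For reference, the single-pipeline attempt was structured roughly like the sketch below (a hypothetical reconstruction, not my exact code; the table names, source path, and threshold are placeholders, and `spark` is the session available in the pipeline notebook):

```python
import dlt

@dlt.table(name="my_table_tmp", temporary=True)
def my_table_tmp():
    # Build the candidate result first as a temporary table.
    return spark.read.table("catalog.schema.source_table")

@dlt.table(name="my_table")
def my_table():
    # Intended gate: only materialize the final table when the temporary
    # table is large enough. In practice the count below always came back
    # as 0 during the pipeline update, so the gate never passes.
    if dlt.read("my_table_tmp").count() <= 1_000_000:
        raise ValueError("my_table_tmp has too few rows")
    return dlt.read("my_table_tmp")
```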

I've also tried simply using `count(1)` in the `dlt.expect_or_fail` decorator, but that always results in an error and doesn't seem to be supported.
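
(That attempt looked roughly like the following, with hypothetical names; it presumably fails because expectation conditions are evaluated against each record, so aggregate functions are not accepted there.)

```python
@dlt.table(name="my_table")
# Rejected: aggregates such as count(1) are not valid in a row-level expectation.
@dlt.expect_or_fail("enough_rows", "count(1) > 1000000")
def my_table():
    return dlt.read("my_table_tmp")
```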

 

In general, the question is: how can I verify conditions that involve aggregations over the data in a DLT pipeline, and apply the update only if the verification succeeds?

2 REPLIES

mariadawson
New Contributor III

Currently, DLT doesn't natively support applying expectations or conditional logic based on aggregate metrics such as row counts within a single pipeline step. That's why `dlt.expect_or_fail` with an aggregate condition, and counting the rows of a DLT table from inside the pipeline, don't work as expected.
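
One pattern that is sometimes used to work around this is to collapse the dataset into a single row that carries the aggregate, and apply a row-level expectation to that row so a failed check fails the update. A minimal sketch, assuming a source dataset named `my_source` defined in the same pipeline and a one-million-row threshold (all names here are placeholders):

```python
import dlt
from pyspark.sql import functions as F

@dlt.table(
    name="my_source_rowcount_check",
    temporary=True,
    comment="Single-row validation table carrying the row count of my_source.",
)
@dlt.expect_or_fail("at_least_one_million_rows", "row_count >= 1000000")
def my_source_rowcount_check():
    # Aggregate the whole dataset into one row so the row-level
    # expectation can evaluate the aggregate value.
    return dlt.read("my_source").agg(F.count("*").alias("row_count"))
```

Whether a failed check here actually blocks the refresh of the downstream materialized view depends on where the check sits in the dependency graph, so it's worth confirming the failure behavior on a test pipeline before relying on it.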

yuinagam
New Contributor II

Thank you for the quick reply.

Is the aggregate-check pattern you sketched the common/recommended way to work around this limitation, or is there a better approach? I don't mind not using the expectations API if it doesn't support logic based on aggregations.