cancel
Showing results forย 
Search instead forย 
Did you mean:ย 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results forย 
Search instead forย 
Did you mean:ย 

Lakeflow SDP expectations

IM_01
Contributor II

Hi,

 

Is there a way to get number of warned records, dropped records , failed records for each expectation I see currently it gives aggregated count

1 REPLY 1

Ashwin_DSA
Databricks Employee
Databricks Employee

Hi @IM_01,

You canโ€™t change the UI to break out those numbers, but you can get per-expectation counts from the DLT (Lakeflow) event log. Each expectation entry records passed_records and failed_records; for EXPECT rules failed_records = warned rows, and for EXPECT โ€ฆ DROP ROW rules failed_records = dropped rows. Expectations configured with FAIL UPDATE donโ€™t emit aggregate metrics.

Here is a sample query you can run. Just replace the DLT table name where it says my_dlt_table

WITH exploded AS (
  SELECT
    timestamp,
    explode(
      from_json(
        details:flow_progress:data_quality:expectations,
        'array<struct<name:string,dataset:string,passed_records:long,failed_records:long>>'
      )
    ) AS e
  FROM event_log(TABLE(my_dlt_table))
  WHERE details:flow_progress:data_quality IS NOT NULL
)
SELECT
  timestamp,
  e.name           AS expectation_name,
  e.dataset,
  e.passed_records,
  e.failed_records
FROM exploded
ORDER BY timestamp DESC, expectation_name;

I tested it for a sample table and it returned the split. I'm guessing this is what you want to see?

 

DLT_Expectations.png

You can also take a look at the documentation here for exploring data quality / expectations metrics from the event log.

Hope this helps.

If this answer resolves your question, could you mark it as โ€œAccept as Solutionโ€? That helps other users quickly find the correct fix.

 

Regards,
Ashwin | Delivery Solution Architect @ Databricks
Helping you build and scale the Data Intelligence Platform.
***Opinions are my own***