cancel
Showing results forย 
Search instead forย 
Did you mean:ย 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results forย 
Search instead forย 
Did you mean:ย 

DLT - how to log number of rows read and written

Faisal
Contributor

Hi @Kaniz_Fatma - how to log number of rows read and written in dlt pipeline, I want to store it in audit tables post the pipeline update completes. Can you give me sample query code ?

2 REPLIES 2

Kaniz_Fatma
Community Manager
Community Manager

Hi @Faisal, Certainly! To log the number of rows read and written in a DLT (Data Lake Table) pipeline and store it in an audit table after the pipeline update completes, you can follow these steps:

 

Capture Metrics in Azure Data Factory (ADF):

  • In your pipeline, after the data copy activity, add an additional activity (e.g., a Set Variable activity).
  • Use the output JSON of the Copy Activity to extract the relevant metrics. Specifically, you can access the rowsRead and rowsCopied properties from the output.
  • Store these values in a variable.

Invoke a Stored Procedure:

  • Create a stored procedure in your SQL Server database that accepts the row counts as input parameters.
  • In your pipeline, use an Execute Stored Procedure activity to invoke this stored procedure.
  • Pass the rowsRead and rowsCopied values as parameters to the stored procedure.
  • The stored procedure will insert these values into your metadata table.

Example Query Code:

Create a stored procedure to insert row counts into the metadata table

---- Create a stored procedure to insert row counts into the metadata table

  CREATE PROCEDURE dbo.InsertRowCounts    @sourceRowCount INT,    @sinkRowCount INT AS BEGIN    INSERT INTO dbo.AuditMetadata (SourceRowCount, SinkRowCount, Timestamp)    VALUES (@sourceRowCount, @sinkRowCount, GETDATE()); END; 

-- In your pipeline, use the Execute Stored Procedure activity 

EXEC dbo.InsertRowCounts    @sourceRowCount = @variables('sourceRowCount'),    @sinkRowCount = @variables('sinkRowCount');

Note: Replace dbo.AuditMetadata with your actual metadata table name.

 

Handle Errors:

  • Ensure proper error handling in case the stored procedure fails to execute.
  • You can use Try-Catch blocks or other mechanisms to handle exceptions.

Remember to adjust the query code according to your specific database schema and naming conventions. If you encounter any issues, feel free to ask for further assistance! ๐Ÿš€

Faisal
Contributor

Thanks @Kaniz_Fatma but I asked on how to log number of rows/written via a delta live table (DLT) pipeline, not a delta lake table and the solution you gave is related to data factory pipeline which is not what I need.

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you wonโ€™t want to miss the chance to attend and share knowledge.

If there isnโ€™t a group near you, start one and help create a community that brings people together.

Request a New Group