cancel
Showing results for 
Search instead for 
Did you mean: 

Pull query that inserts into table

cmilligan
Contributor II

I'm trying to pull some data down for table history and am needing to view the query that inserted into a table. My team owns the process so I'm able to view the current query by just viewing it but I'm also wanting to capture changes over time without having to store off different versions.

Is there a way to programmatically pull the query that inserted into a table?

1 REPLY 1

Anonymous
Not applicable

@Coleman Milligan​ :

Yes, in Databricks, you can use the built-in Delta Lake feature to track the history of changes made to a table, including the queries that inserted data into it.

Here's an example of how to retrieve the queries that inserted data into a Delta table using Databricks SQL:

SELECT operationMetrics.`write`.`txnId`, input_file_name(), operationMetrics.`write`.`query` AS insert_query
FROM table_name
WHERE operation = 'WRITE'
ORDER BY operationMetrics.`write`.`txnId` DESC

This query retrieves the transaction ID, file name, and insert query for each write operation on the specified table. The input_file_name() function returns the name of the file containing the data for the write operation. Note that the above query assumes that you have enabled Delta Lake versioning on your table. If you haven't, you can enable it using the following command:

ALTER TABLE table_name SET TBLPROPERTIES ('delta.enableChangeDataCapture' = 'true');

Once versioning is enabled, Delta Lake automatically tracks changes to the table, including insert queries, and stores them in its metadata.

Welcome to Databricks Community: Lets learn, network and celebrate together

Join our fast-growing data practitioner and expert community of 80K+ members, ready to discover, help and collaborate together while making meaningful connections. 

Click here to register and join today! 

Engage in exciting technical discussions, join a group with your peers and meet our Featured Members.