Data Governance
Join discussions on data governance practices, compliance, and security within the Databricks Community. Exchange strategies and insights to ensure data integrity and regulatory compliance.

How to prevent infrequently updated tables from dropping out of Unity Catalog data lineage?

ntlbdv
New Contributor III

Using Unity Catalog as the unified metastore for Databricks, we can track the data lineage of tables.

Lineage is retained for 30 days, as described in the official documentation:

- Because lineage is computed on a 30-day rolling window, lineage is not displayed for tables that have not been modified within the last 30 days.

This means that if a table is not updated for 30 days, its lineage is no longer visible. The lineage becomes visible again once the table is updated.

I am looking for a way around this limitation for use cases that need longer retention (e.g. quarterly or annual reporting).
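To make the rule concrete, here is a minimal sketch that flags tables whose last modification is older than the window. It only restates the 30-day rule quoted above; the function name, the table names, and the way the last-modified timestamps are obtained are all hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Window length taken from the documentation quote above.
LINEAGE_WINDOW = timedelta(days=30)


def tables_outside_window(last_modified, now=None, window=LINEAGE_WINDOW):
    """Return the names of tables last modified longer ago than `window`.

    `last_modified` maps a table name to a timezone-aware datetime
    (hypothetical input; how you collect it is up to you).
    """
    now = now or datetime.now(timezone.utc)
    return sorted(
        name for name, ts in last_modified.items() if now - ts > window
    )
```

A quarterly reporting table written once at the end of the quarter would show up in this list for most of the following quarter.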

What I have tried already:

I checked whether running OPTIMIZE changes the 'updated at' value in UC and found the following:

- The operation creates a new version of the table in the table history

- However, the 'updated at' value on the 'Details' tab in Unity Catalog does not change

I think this is because OPTIMIZE and similar operations change the file layout but not the data.
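The observation above can be sketched as a small filter over table history rows. This assumes rows shaped like the output of `DESCRIBE HISTORY` (with `version`, `timestamp`, and `operation` fields) and assumes that OPTIMIZE and VACUUM are the maintenance operations to ignore. Whether Unity Catalog derives 'updated at' this way is only my guess, not something the documentation confirms.

```python
# Operations assumed to rewrite files without changing data (an assumption,
# not an official list).
MAINTENANCE_OPERATIONS = {"OPTIMIZE", "VACUUM START", "VACUUM END"}


def last_data_change(history):
    """Return the newest history row that is not a maintenance operation.

    `history` is an iterable of dicts with at least `version` and
    `operation` keys. Returns None if every row is maintenance-only.
    """
    data_commits = [
        row for row in history
        if row["operation"] not in MAINTENANCE_OPERATIONS
    ]
    if not data_commits:
        return None
    return max(data_commits, key=lambda row: row["version"])
```

With a history of WRITE (version 0) followed by OPTIMIZE (version 1), this returns the version 0 row, which matches what I saw: a new table version exists, but the last data change is still the original write.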

The best solution for these cases would be to collect the lineage through the API and visualize it inside Unity Catalog. However, Unity Catalog does not allow lineage to be inserted manually.
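One way to gather the data would be to snapshot the lineage to a file while it is still inside the window. The sketch below assumes the lineage REST endpoint `/api/2.0/lineage-tracking/table-lineage` with the `table_name` and `include_entity_lineage` parameters, and assumes they can be passed as query parameters with a personal access token. The helper names, host, and output path are hypothetical, and the snapshot would have to be stored and visualized outside Unity Catalog.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Endpoint path assumed from the Databricks data lineage REST API.
LINEAGE_PATH = "/api/2.0/lineage-tracking/table-lineage"


def build_lineage_request(host, token, table_name):
    """Build a GET request for the lineage of one table (no network call)."""
    query = urlencode(
        {"table_name": table_name, "include_entity_lineage": "true"}
    )
    return Request(
        f"{host.rstrip('/')}{LINEAGE_PATH}?{query}",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


def snapshot_lineage(host, token, table_name, out_path):
    """Fetch the current lineage and write it to `out_path` as JSON."""
    with urlopen(build_lineage_request(host, token, table_name)) as resp:
        payload = json.load(resp)
    with open(out_path, "w", encoding="utf-8") as fh:
        json.dump(payload, fh, indent=2)
    return payload
```

Running `snapshot_lineage` on a schedule shorter than 30 days would keep a copy of the lineage for tables that are only written quarterly or annually.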

2 REPLIES

Aviral-Bhardwaj
Esteemed Contributor III

This is really interesting; I have to explore this more.

AviralBhardwaj

NOOR_BASHASHAIK
Contributor

@Natalia Lebedeva Did you discover any other possible workaround?
