Databricks Unity Catalog REST API documentation link and lineage retention period

sukhendu2017
New Contributor II
Hi Team,
Hope you are all doing well.

Question 1:
I am using this Databricks Unity Catalog REST API endpoint to get lineage: api/2.0/lineage-tracking/table-lineage?table_name=catalogname.schemaname.tablename&include_entity_lineage=true

Is this officially supported? If so, please share the official documentation link.

If not, please share an alternative REST API/SDK approach.

Question 2:
I also want to get lineage for the last 6, 12, and 18 months.
I am using this Databricks Unity Catalog REST API endpoint: api/2.0/lineage-tracking/table-lineage?table_name=catalogname.schemaname.tablename&include_entity_lineage=true&start_timestamp=1725235200000

Is this officially supported? If so, please share the official documentation link.

If not, please share an alternative REST API/SDK approach.

3 REPLIES

Ashwin_DSA
Databricks Employee

Hi @sukhendu2017

Here’s how things stand today...

1. Is /api/2.0/lineage-tracking/table-lineage officially supported?

No. The endpoint /api/2.0/lineage-tracking/table-lineage (including parameters like include_entity_lineage and start_timestamp) is an internal, undocumented API used by the Databricks UI. Since it does not appear in the official REST API reference, it is considered unsupported:

  • It is not listed in the Databricks REST API docs.
  • Databricks does not provide an SLA or guarantee of stability for undocumented endpoints, and they may change without notice.
  • For production use, you should rely only on the APIs documented in the official reference.

2. Can I get lineage for 6, 12, or 18 months?

Unity Catalog lineage data is retained for a finite period (documented as up to 365 days in the system tables lineage documentation), so the approach depends on how far back you need to go:

  • Up to ~12 months: Use official system tables for lineage rather than the internal REST endpoint.
  • Beyond 12 months (e.g., 18 months): Not retained by default. You must export/archive lineage yourself (e.g., into your own Delta tables) before it ages out.
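
To make the retention arithmetic concrete, here is a minimal sketch that converts an epoch-millisecond start_timestamp (the value from the question) to a UTC date and checks whether it still falls inside a 365-day window. The function name `within_retention`, the fixed reference dates, and treating the boundary as inclusive are assumptions for illustration, not documented Databricks behaviour.

```python
from datetime import datetime, timedelta, timezone

RETENTION_DAYS = 365  # retention figure quoted in the reply above


def within_retention(start_ms: int, now: datetime) -> bool:
    """Return True if start_ms (epoch milliseconds) is at most RETENTION_DAYS before now."""
    # Integer division keeps the conversion exact for whole-second timestamps.
    start = datetime.fromtimestamp(start_ms // 1000, tz=timezone.utc)
    return now - start <= timedelta(days=RETENTION_DAYS)


# start_timestamp from the question; it corresponds to 2024-09-02 00:00 UTC.
start_ms = 1725235200000
print(datetime.fromtimestamp(start_ms // 1000, tz=timezone.utc).date())

# Hypothetical "current" dates, fixed so the example is reproducible.
print(within_retention(start_ms, datetime(2025, 3, 2, tzinfo=timezone.utc)))  # about 6 months back
print(within_retention(start_ms, datetime(2026, 3, 2, tzinfo=timezone.utc)))  # about 18 months back
```

A start date roughly 6 months back passes the check, while one roughly 18 months back does not, which is the case where you need your own archive.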

My recommendation: instead of calling the internal /lineage-tracking/table-lineage endpoint, use:

  • System table system.access.table_lineage (and related tables) as the supported source of lineage, and
  • The Statement Execution REST API or a Databricks SDK to run SQL queries over those tables and filter for your required 6- or 12-month window.
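
As a rough sketch of the second bullet, the snippet below builds a JSON request body for the Statement Execution API (POST /api/2.0/sql/statements) that filters system.access.table_lineage to the last N calendar months. It only constructs the body and does not send it. The helper names `months_ago` and `build_lineage_request`, the table name, and the warehouse ID placeholder are all assumptions for illustration; check the column list against the schema in your own workspace.

```python
import calendar
from datetime import datetime, timezone
from typing import Optional

LINEAGE_SQL = """
SELECT source_table_full_name, target_table_full_name, entity_type, event_time
FROM system.access.table_lineage
WHERE target_table_full_name = :table_name
  AND event_date >= :start_date
ORDER BY event_time
"""


def months_ago(now: datetime, months: int) -> datetime:
    """Step back whole calendar months, clamping the day (Mar 31 -> Feb 28)."""
    year, month0 = divmod(now.year * 12 + now.month - 1 - months, 12)
    day = min(now.day, calendar.monthrange(year, month0 + 1)[1])
    return now.replace(year=year, month=month0 + 1, day=day)


def build_lineage_request(table_name: str, months: int, warehouse_id: str,
                          now: Optional[datetime] = None) -> dict:
    """Build the JSON body for POST /api/2.0/sql/statements (nothing is sent here)."""
    now = now or datetime.now(timezone.utc)
    start_date = months_ago(now, months).date().isoformat()
    return {
        "warehouse_id": warehouse_id,  # assumption: ID of a SQL warehouse you can use
        "statement": LINEAGE_SQL,
        "wait_timeout": "30s",
        # Named parameters keep the table name and date out of the SQL string.
        "parameters": [
            {"name": "table_name", "value": table_name, "type": "STRING"},
            {"name": "start_date", "value": start_date, "type": "DATE"},
        ],
    }


# Hypothetical table and warehouse ID, for illustration only.
request_body = build_lineage_request("main.sales.orders", 6, "<warehouse-id>")
print(request_body["parameters"])
```

To run it, you would POST this body as JSON to /api/2.0/sql/statements on your workspace host with a bearer token. Changing the `months` argument to 12 gives the 12-month window.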

For 18-month requirements, implement a scheduled job that reads from system.access.table_lineage and writes lineage data to your own long‑term storage before it expires.

Relevant docs as requested:

Hope this helps.

If this answer resolves your question, could you mark it as “Accept as Solution”? That helps other users quickly find the correct fix.

Regards,
Ashwin | Delivery Solution Architect @ Databricks
Helping you build and scale the Data Intelligence Platform.
***Opinions are my own***

sukhendu2017
New Contributor II

Thank you so much for the detailed response and solution.

I have one question: the Unity Catalog lineage UI also shows the last 18 months, but according to the official documentation lineage is retained for only up to 1 year. Why does this contradict the documentation?
Also, if one of the tables has lineage going back 24 or 36 months, will the UI show all of it?

For reference, please find the screenshot attached.

If the UI shows the last 18 months, can I get the same table and column lineage through the system table system.access.table_lineage (and related tables)?

Hi @sukhendu2017 - Thanks for the follow‑up and for sharing the screenshot. This is a great question.

The key point is that the official retention guarantee is what’s documented, not what an internal/UX control happens to allow:

  • The docs for Unity Catalog lineage and system tables describe retention as up to ~1 year. That’s the only behaviour Databricks commits to and supports as a contract.
  • The UI filter (e.g. “Last 18 months” / “All available”) can show more than 12 months in some workspaces/regions if that data happens to exist, but that is not a documented guarantee and can change over time. Think of it as “best‑effort: show whatever data is currently stored”, not “we guarantee 18/24/36 months”.

So there isn’t really a contradiction:

  • Docs = supported retention contract (up to 1 year).
  • UI = whatever lineage is currently available in the backend for your workspace, which may sometimes exceed 12 months but isn’t guaranteed.

 

If older lineage still exists in the backend, the UI may show it under “All available”. But because this is beyond the documented retention window, Databricks can change that behaviour or compact old data without notice. I would not design any governance process that depends on >12 months being present in the UI.

You can always query whatever lineage data is currently stored in the official system tables:

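-- Lineage events that wrote to a given table since a chosen start time.
-- Replace the <...> placeholders with your own values.
SELECT *
FROM system.access.table_lineage
WHERE target_table_full_name = '<catalog>.<schema>.<table>'
  AND event_time >= '<start_timestamp>'

However, just like the UI: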
  • System tables are only guaranteed to hold data within the documented retention period (≈1 year).
  • If you currently see 18 months of lineage in the UI, you will typically see the same in system.access.table_lineage today, but that is not a long‑term guarantee.

If you must have a reliable 18, 24, or 36‑month lineage history, the recommended pattern is:

  1. Use system.access.table_lineage (and related tables) as the supported source of lineage.
  2. Set up a scheduled job to periodically export/append lineage rows into your own long‑term Delta tables or warehouse, so you control retention beyond the built‑in window.
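
The two steps above can be sketched as a pair of SQL statements that a scheduled job runs in order: create an empty archive table once, then append only rows newer than what the archive already holds. This is one possible approach, not an official recipe. The helper name `build_archive_statements` and the archive table name are hypothetical, and using event_time as a high-water mark is an assumption: rows that arrive late with an older event_time would be missed, and a change to the system table's schema could break the `SELECT *` insert.

```python
import re
from typing import List

SOURCE_TABLE = "system.access.table_lineage"
# Accept only plain three-level names (catalog.schema.table) to avoid SQL injection.
_NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*){2}")


def build_archive_statements(archive_table: str) -> List[str]:
    """Return the SQL statements for one incremental archive run, in execution order."""
    if not _NAME_RE.fullmatch(archive_table):
        raise ValueError(f"expected catalog.schema.table, got {archive_table!r}")
    # 1) Create an empty copy of the source schema on the first run only.
    create = (
        f"CREATE TABLE IF NOT EXISTS {archive_table} "
        f"AS SELECT * FROM {SOURCE_TABLE} LIMIT 0"
    )
    # 2) Append rows newer than the latest event already archived.
    append = (
        f"INSERT INTO {archive_table} "
        f"SELECT * FROM {SOURCE_TABLE} "
        f"WHERE event_time > (SELECT COALESCE(MAX(event_time), TIMESTAMP '1970-01-01') "
        f"FROM {archive_table})"
    )
    return [create, append]


# Hypothetical archive location, for illustration only.
for statement in build_archive_statements("main.governance.table_lineage_archive"):
    print(statement)
```

Running the job well within the retention window (for example daily) leaves plenty of margin before source rows age out, and retention of the archive table is then entirely under your control.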

Hope this clarifies your question.

If this answer resolves your question, could you mark it as “Accept as Solution”? That helps other users quickly find the correct fix.

Regards,
Ashwin | Delivery Solution Architect @ Databricks
Helping you build and scale the Data Intelligence Platform.
***Opinions are my own***