UnityCatalog
06-29-2023 11:27 AM
Can we get the lineage for an on-prem Spark application running locally?
06-29-2023 12:02 PM
Yes, you can create private endpoint links to connect to your on-premises systems and view them from Unity Catalog as a unified source.
06-29-2023 12:26 PM
Yes, you can create private endpoint links and application access (depending on your cloud provider) to connect to your on-premises systems, and Unity Catalog can sync with them. Lineage is supported for all languages and is captured down to the column level. Lineage data includes the notebooks, workflows, and dashboards related to the query. Lineage can be visualized in Data Explorer.
For Azure, the following link is useful:
https://learn.microsoft.com/en-us/azure/databricks/data-governance/unity-catalog/data-lineage
Requirements
The following are required to capture data lineage with Unity Catalog:
- The workspace must have Unity Catalog enabled and be launched in the Premium tier.
- Tables must be registered in a Unity Catalog metastore to be eligible for lineage capture.
- Queries must use the Spark DataFrame (for example, Spark SQL functions that return a DataFrame) or Databricks SQL interfaces. For examples of Databricks SQL and PySpark queries, see Examples.
- To view the lineage of a table or view, users must have the SELECT privilege on the table or view.
- To view lineage information for notebooks, workflows, or dashboards, users must have permissions on these objects as defined by the access control settings in the workspace. See Lineage permissions.
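To make the requirements above concrete, here is a minimal Python sketch. It assumes the code runs on a Unity Catalog-enabled Databricks workspace where a `spark` session is already available. The table names (`main.sales.orders`, `main.sales.orders_copy`) and the principal `analyst@example.com` are hypothetical, and the helper functions are illustrative only, not part of any Databricks API.

```python
import re

# Simple identifier check so the sketch never builds SQL from odd input.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def qualified_name(catalog: str, schema: str, table: str) -> str:
    """Build a three-level Unity Catalog name: catalog.schema.table."""
    for part in (catalog, schema, table):
        if not _IDENT.match(part):
            raise ValueError(f"not a simple identifier: {part!r}")
    return f"{catalog}.{schema}.{table}"


def grant_select_sql(table_name: str, principal: str) -> str:
    """Return a GRANT statement giving a principal SELECT on a table.

    SELECT on the table is what the requirements list for viewing its lineage.
    """
    if "`" in principal:
        raise ValueError("principal must not contain backticks")
    return f"GRANT SELECT ON TABLE {table_name} TO `{principal}`"


def copy_table(spark, source: str, target: str) -> None:
    """Read one registered table and write another via the DataFrame API.

    Assumption: `spark` is the session of a Unity Catalog-enabled cluster,
    and both names refer to tables in a Unity Catalog metastore.
    """
    df = spark.table(source)
    df.write.mode("overwrite").saveAsTable(target)


# Hypothetical names, for illustration only.
source = qualified_name("main", "sales", "orders")
target = qualified_name("main", "sales", "orders_copy")
statement = grant_select_sql(target, "analyst@example.com")
```

On a workspace you might then call `copy_table(spark, source, target)` followed by `spark.sql(statement)`. Whether lineage appears for the result still depends on the requirements listed above being met.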