
Why is my lineage extraction not showing up in the Unity Catalog

stephansmit
New Contributor III

I'm trying to get the lineage graph to work in Unity Catalog, but nothing appears even though I followed the docs. I took the following steps:

1. Created a Unity metastore and attached the workspace to that metastore.

2. Created a Single user Spark 11.2 cluster within the linked workspace with the following option:

spark.databricks.dataLineage.enabled true

3. Created a catalog lineage_data and a schema within that called lineagedemo managed by the Unity metastore.

4. Executed the following script with the previously created Spark cluster (lineage demo: https://docs.databricks.com/data-governance/unity-catalog/data-lineage.html):

CREATE TABLE IF NOT EXISTS
  lineage_data.lineagedemo.menu (
    recipe_id INT,
    app string,
    main string,
    dessert string
  );
 
INSERT INTO lineage_data.lineagedemo.menu
    (recipe_id, app, main, dessert)
VALUES
    (1,"Ceviche", "Tacos", "Flan"),
    (2,"Tomato Soup", "Souffle", "Creme Brulee"),
    (3,"Chips","Grilled Cheese","Cheesecake");
 
CREATE TABLE
  lineage_data.lineagedemo.dinner
AS SELECT
  recipe_id, concat(app," + ", main," + ",dessert)
AS
  full_menu
FROM
  lineage_data.lineagedemo.menu
 

5. Went to the Data Explorer and opened the lineage tab for the table dinner.

No lineage is visible. What am I doing wrong? Where can I find more logging regarding the lineage extraction?

3 REPLIES

stephansmit
New Contributor III

I can see the data inside the tables by querying them, so I do have the privileges to see the lineage.

L_Favre
New Contributor II

We have had the same problem from the beginning, and Microsoft support has not been able to provide a solution since mid-December (a case is open).

At the beginning of December, we noticed that no lineage data was showing up despite GA, so we tried the simple example from the official docs, without any success either.

We also tried many alternatives, either on our own or at the request of support, without success:

  • Run the notebook as a job
  • Grant SELECT / ALL to users on the whole catalog/schema
  • Omit/add the Spark lineage flag option
  • Create tables from scratch

The cluster logs seem empty as well.

We tried in other workspaces, with other metastores, without any success.

@Stephan Smit Is your workspace deployed in Azure as well? If yes, which region?

L_Favre
New Contributor II

@Stephan Smit We finally got a solution from level 3 support (Databricks support).

You may check your firewall logs.

On our side, we had to open communication to the "Event Hub endpoint".

The destination depends on your workspace region: Azure Databricks regions - Azure Databricks | Microsoft Learn
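
If it helps, here is a rough way to check from a notebook whether the cluster can open a TCP connection to that endpoint at all. This is only a sketch, not something support gave us: the function name can_reach is made up, the hostname in the usage line is a placeholder you would replace with the Event Hub endpoint listed for your region, and port 9093 is an assumption you should verify against the regions page.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper: it only tests basic network reachability
    (DNS resolution plus TCP connect), not whether lineage events
    are actually accepted by the endpoint.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts.
        return False


# Usage (placeholder hostname, assumed port; replace both for your region):
# print(can_reach("<event-hub-endpoint-for-your-region>", 9093))
```

A False result would point toward a firewall or routing rule blocking the traffic, which is what we had. A True result only shows that the port is reachable, so the firewall logs are still worth checking.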
