
Why is my lineage extraction not showing up in the Unity Catalog

stephansmit
New Contributor III

I'm trying to get the lineage graph to work in Unity Catalog, but nothing appears even though I followed the docs. I took the following steps:

1. Created a Unity metastore and attached the workspace to that metastore.

2. Created a Single user Spark 11.2 cluster within the linked workspace with the following option:

spark.databricks.dataLineage.enabled true

3. Created a catalog lineage_data and a schema within that called lineagedemo managed by the Unity metastore.

4. Executed the following script with the previously created Spark cluster (lineage demo: https://docs.databricks.com/data-governance/unity-catalog/data-lineage.html):

CREATE TABLE IF NOT EXISTS
  lineage_data.lineagedemo.menu (
    recipe_id INT,
    app string,
    main string,
    dessert string
  );
 
INSERT INTO lineage_data.lineagedemo.menu
    (recipe_id, app, main, dessert)
VALUES
    (1,"Ceviche", "Tacos", "Flan"),
    (2,"Tomato Soup", "Souffle", "Cree Brulee"),
    (3,"Chips","Grilled Cheese","Cheesecake");
 
CREATE TABLE
  lineage_data.lineagedemo.dinner
AS SELECT
  recipe_id, concat(app," + ", main," + ",dessert)
AS
  full_menu
FROM
  lineage_data.lineagedemo.menu
 

5. Went to the Data Explorer and opened the lineage tab for the table dinner.

No lineage is visible. What am I doing wrong? Where can I find more logging for the lineage extraction?

3 REPLIES

stephansmit
New Contributor III

I can see the data inside the tables by querying them, so I do have the privileges needed to see the lineage.

L_Favre
New Contributor II

We have had the same problem from the beginning, and Microsoft support has not been able to provide a solution since mid-December (a case is open).

At the beginning of December, we noticed that no lineage data was showing up despite GA, so we tried the simple example from the official docs, also without success.

We also tried several alternatives, either on our own or at the request of support, without success:

  • Running the notebook as a job
  • Granting SELECT / ALL on the whole catalog/schema to users
  • Omitting/adding the Spark lineage flag option
  • Creating the tables from scratch

The cluster logs seem empty as well.

We also tried other workspaces and other metastores, without any success.

@Stephan Smit Is your workspace deployed in Azure as well? If yes, in which region?

L_Favre
New Contributor II

@Stephan Smit​ We finally got a solution from level 3 support (Databricks support).

Check your firewall logs for blocked outbound traffic from the workspace.

On our side, we had to open communication to the "Event Hub endpoint".

The destination depends on your workspace region; it is listed on the "Azure Databricks regions" page on Microsoft Learn.
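
For anyone who wants to test this before digging through firewall logs, here is a minimal sketch of a TCP reachability check that could be run from a notebook on the affected cluster. The function name `can_connect` is made up for illustration, and both the hostname and the port are assumptions: the hostname is a placeholder for the Event Hub endpoint of your region, and 9093 is only a guess at the port, so check both against the regions page. A successful connection only shows that the firewall allows the route; it does not prove that lineage events are actually delivered.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened within `timeout` seconds."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


# Example call (placeholder hostname and assumed port, replace with the
# values listed for your workspace region):
# can_connect("<event-hub-endpoint-for-your-region>", 9093)
```

If the call returns False from the cluster but True from a machine outside the firewall, that points to a blocked route like the one we had.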
