
Missing Delta Live Table in hive_metastore catalog

BobCat62
New Contributor II

Hi experts,

I defined my Delta table in an external location as follows:

%sql
CREATE OR REFRESH STREAMING TABLE pumpdata (
  Body string,
  EnqueuedTimeUtc string,
  SystemProperties string,
  _rescued_data string,
  Properties string
)
USING DELTA
LOCATION 'abfss://mdwh@XXXX.dfs.core.windows.net/Bronze/pumpdata'

I have a Delta Live Tables pipeline with these settings.

As you can see, I have defined the same external location and set Hive metastore as the storage option:

[Screenshot Bild1.png: DLT pipeline settings]

and this definition:

import dlt

# Source path with wildcards over the nested folder structure
json_path = "abfss://schachtwasser@XXXX.dfs.core.windows.net/XXXX/*/*/*/*/*.JSON"

@dlt.create_table(
    name="pumpdata",
    table_properties={"quality": "raw"},
    comment="Data ingested from an ADLS2 storage account.",
)
def pumpdata():
    # Incrementally ingest the raw JSON files with Auto Loader
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "JSON")
        .load(json_path)
    )

I can run my DLT pipeline successfully, and Parquet files are written to the storage account, but I cannot see my table in the catalog under hive_metastore:

[Screenshots: Bild2.png, Bild3.png, Bild4.png]

1 ACCEPTED SOLUTION

ashraf1395
Valued Contributor III

Hey @BobCat62, this might help:

[Screenshot: ashraf1395_0-1740203798056.png]

DLT runs in direct publishing mode by default. If you select hive_metastore as the storage option, you must specify a default schema in the DLT pipeline settings. If you don't set it there, pass the fully qualified name, schema_name.pumpdata, when defining the DLT table.

For example, default.pumpdata will store the table in the default schema of hive_metastore.
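Concretely, here is a minimal sketch of that second option, adapted from the pipeline code in the question; the storage path is copied from the original post, and default as the target schema name is an assumption:

import dlt

json_path = "abfss://schachtwasser@XXXX.dfs.core.windows.net/XXXX/*/*/*/*/*.JSON"

# Qualifying the table name as schema_name.table_name publishes it to
# that schema in hive_metastore instead of leaving it unregistered.
@dlt.create_table(
    name="default.pumpdata",  # "default" is an assumed target schema
    table_properties={"quality": "raw"},
    comment="Data ingested from an ADLS2 storage account.",
)
def pumpdata():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "JSON")
        .load(json_path)
    )

Alternatively, set the default schema directly in the pipeline settings, as shown in the screenshot above.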


2 REPLIES

KaranamS
Contributor II

Hi @BobCat62, try these steps:

1. Try manually registering the table in hive_metastore. Run this in a Databricks notebook:

CREATE TABLE hive_metastore.default.pumpdata USING DELTA LOCATION 'abfss://mdwh@XXXX.dfs.core.windows.net/Bronze/pumpdata/tables/pumpdata';

2. Then verify the table by running:

SHOW TABLES IN hive_metastore.default;

This should register the table under hive_metastore.

