Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.

How to apply Primary Key constraint in Delta Live Table?

SRK
Contributor III

In this blog I can see that the primary key constraint has been applied to dimension and fact tables. Here is an example:

-- Store dimension
CREATE OR REPLACE TABLE dim_store(
  store_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  business_key STRING,
  name STRING,
  email STRING,
  city STRING,
  address STRING,
  phone_number STRING,
  created_date TIMESTAMP,
  updated_date TIMESTAMP,
  start_at TIMESTAMP,
  end_at TIMESTAMP
);

I want to apply the same for Delta Live Tables. Something like this:

CREATE OR REFRESH STREAMING LIVE TABLE dim_store(
  store_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  business_key STRING,
  name STRING,
  email STRING,
  city STRING,
  address STRING,
  phone_number STRING,
  created_date TIMESTAMP,
  updated_date TIMESTAMP,
  start_at TIMESTAMP,
  end_at TIMESTAMP
);

However, when I run the Delta Live pipeline, it throws the following error:

Unsupported SQL statement for table 'dim_store': Missing query is not supported.

Can anyone please tell me how to apply a primary key constraint to a Delta Live Table? I know Databricks does not enforce PK/FK relationships; I want the PK/FK constraints for informational purposes only.

Kindly help here.

6 REPLIES

-werners-
Esteemed Contributor III

I don't think that it is possible (yet).

AFAIK you can only have expectations:

https://learn.microsoft.com/en-us/azure/databricks/workflows/delta-live-tables/delta-live-tables-exp....

But DLT is pretty new, so it might get added later on.
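As a sketch of the expectations approach (raw_stores is a hypothetical source view, not from the original post), a data-quality rule can be attached in DLT SQL. Note also that CREATE OR REFRESH ... LIVE TABLE requires an AS SELECT query, which is likely what the "Missing query" error refers to:

```sql
-- Sketch only: raw_stores is a hypothetical source view.
-- DLT supports expectations, not PK constraints, for data-quality rules.
CREATE OR REFRESH STREAMING LIVE TABLE dim_store(
  CONSTRAINT store_id_set EXPECT (store_id IS NOT NULL) ON VIOLATION DROP ROW
)
AS SELECT * FROM STREAM(LIVE.raw_stores);
```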

SRK
Contributor III

Thanks for the reply Werners.

Harun
Honored Contributor

I second that. Only expectations are currently available in Delta Live Tables to maintain data quality. We may see other constraints in future releases.

youssefmrini
Honored Contributor III

Delta tables in Unity Catalog have PK/FK information (not enforced). Since DLT will soon support UC, I guess they will add this feature.
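For reference, on a regular (non-DLT) Unity Catalog table, an informational primary key can be declared like this, a sketch reusing dim_store and store_id from the example above; the column must be NOT NULL first:

```sql
-- Informational only: Unity Catalog records but does not enforce the constraint.
ALTER TABLE dim_store ALTER COLUMN store_id SET NOT NULL;
ALTER TABLE dim_store ADD CONSTRAINT dim_store_pk PRIMARY KEY (store_id);
```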

Oliver_Angelil
Valued Contributor II

@SRK  The documentation shows an example of how you can apply a PK constraint as an Expectation in DLT:

https://docs.databricks.com/en/delta-live-tables/expectations.html#perform-advanced-validation-with-...

Oliver_Angelil
Valued Contributor II

@SRK Please see a copy of this answer on stackoverflow here

You can use DLT Expectations to have this check (see my previous answer if you're using SQL and not Python):

import dlt
import pyspark.sql.types as T

@dlt.table(name="table1")
def create_table1():
    schema = T.StructType([
        T.StructField("id", T.IntegerType(), True),
        T.StructField("name", T.StringType(), True),
        T.StructField("age", T.IntegerType(), True)])

    # Note: id 1 appears twice, so the expectation on table2 will flag it.
    data = [(1, "Alice", 25),
            (1, "Bob", 30),
            (3, "Charlie", 40)]

    return spark.createDataFrame(data, schema)

@dlt.table(name="table2")
@dlt.expect("unique_pk", "num_entries = 1")
def create_table2():
    # Count rows per id; any id with num_entries > 1 violates the expectation.
    df = dlt.read("table1")
    return df.groupBy("id").count().withColumnRenamed("count", "num_entries")
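Outside of DLT, the same uniqueness check can be sketched in plain Python (the rows below mirror the hypothetical data in the example above): count entries per primary-key value and collect any key that appears more than once.

```python
from collections import Counter

# Same sample data as the DLT example; id 1 is duplicated.
rows = [(1, "Alice", 25), (1, "Bob", 30), (3, "Charlie", 40)]

# Count how many rows share each primary-key value.
counts = Counter(r[0] for r in rows)

# Keys with more than one entry violate the "unique_pk" expectation.
duplicates = {k: n for k, n in counts.items() if n > 1}
print(duplicates)  # → {1: 2}
```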
