Databricks Community

SRK · ‎12-08-2022

In this blog, I can see that a primary key constraint has been applied to the dimension and fact tables. Here is an example:

-- Store dimension
CREATE OR REPLACE TABLE dim_store(
  store_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  business_key STRING,
  name STRING,
  email STRING,
  city STRING,
  address STRING,
  phone_number STRING,
  created_date TIMESTAMP,
  updated_date TIMESTAMP,
  start_at TIMESTAMP,
  end_at TIMESTAMP
);

I want to apply the same for Delta Live Tables. Something like this:

CREATE OR REFRESH STREAMING LIVE TABLE dim_store(
  store_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  business_key STRING,
  name STRING,
  email STRING,
  city STRING,
  address STRING,
  phone_number STRING,
  created_date TIMESTAMP,
  updated_date TIMESTAMP,
  start_at TIMESTAMP,
  end_at TIMESTAMP
);

However, when I run the Delta Live Tables pipeline, it throws the following error:

Unsupported SQL statement for table 'dim_store': Missing query is not supported.

Can anyone please help me and tell me how to apply a primary key constraint to a Delta Live Table? I know Databricks does not enforce PK/FK relationships. I only want the PK/FK constraints for informational purposes.

Kindly help here.

-werners- · ‎12-08-2022

I don't think that it is possible (yet).

AFAIK you can only have expectations:

https://learn.microsoft.com/en-us/azure/databricks/workflows/delta-live-tables/delta-live-tables-exp....

But DLT is pretty new, so it might get added later on

SRK · ‎12-08-2022

Thanks for the reply Werners.

Harun · ‎12-08-2022

I second that. Currently, only expectations are available in Delta Live Tables to maintain data quality. We may see other constraints in future releases.

youssefmrini · ‎12-09-2022

Delta tables on Unity Catalog have PK/FK information (not enforced). Since DLT will soon support UC, I guess they will add this feature.

Oliver_Angelil · ‎10-24-2023

@SRK The documentation shows an example of how you can apply a PK constraint as an Expectation in DLT:

https://docs.databricks.com/en/delta-live-tables/expectations.html#perform-advanced-validation-with-...

Oliver_Angelil · ‎10-24-2023

@SRK Please see a copy of this answer on Stack Overflow here.

You can use DLT expectations to perform this check (see my previous answer if you're using SQL rather than Python):

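import dlt
from pyspark.sql import types as T

@dlt.table(
    name="table1",
)
def create_table1():
    schema = T.StructType([
        T.StructField("id", T.IntegerType(), True),
        T.StructField("name", T.StringType(), True),
        T.StructField("age", T.IntegerType(), True)])

    # Sample data: id 1 appears twice, so the uniqueness check below flags it.
    data = [(1, "Alice", 25),
            (1, "Bob", 30),
            (3, "Charlie", 40)]

    # `spark` is the SparkSession that Databricks provides inside the pipeline.
    df = spark.createDataFrame(data, schema)
    return df

@dlt.table(name="table2")
@dlt.expect("unique_pk", "num_entries = 1")
def create_table2():
    # Count rows per id; any id with num_entries other than 1 violates the expectation.
    df = dlt.read("table1")
    df = df.groupBy("id").count().withColumnRenamed("count", "num_entries")
    return df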
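
To see what the unique_pk expectation would flag without running a pipeline, the same group-and-count logic can be sketched in plain Python. This is only an illustration on the sample rows from table1. The helper names count_entries_per_key and find_duplicate_keys are hypothetical and not part of DLT or Spark.

```python
from collections import Counter

# Same sample rows as table1: (id, name, age). The id 1 appears twice on purpose.
rows = [(1, "Alice", 25), (1, "Bob", 30), (3, "Charlie", 40)]


def count_entries_per_key(rows, key_index=0):
    """Return {key: num_entries}, mirroring groupBy("id").count()."""
    return dict(Counter(row[key_index] for row in rows))


def find_duplicate_keys(rows, key_index=0):
    """Return the sorted keys that would fail the check num_entries = 1."""
    counts = count_entries_per_key(rows, key_index)
    return sorted(key for key, n in counts.items() if n != 1)


num_entries = count_entries_per_key(rows)
duplicates = find_duplicate_keys(rows)
print(num_entries)  # {1: 2, 3: 1}
print(duplicates)   # [1]
```

Here id 1 has two entries, so it is the row of table2 that would be recorded as violating the expectation. Because the example uses dlt.expect, the violation is only recorded and the pipeline keeps running.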

How to apply Primary Key constraint in Delta Live Table?
