Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.

How to apply Primary Key constraint in Delta Live Table?

SRK
Contributor III

In this blog I can see that the primary key constraint has been applied to the dimension and fact tables. Here is the example:

-- Store dimension
CREATE OR REPLACE TABLE dim_store(
  store_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  business_key STRING,
  name STRING,
  email STRING,
  city STRING,
  address STRING,
  phone_number STRING,
  created_date TIMESTAMP,
  updated_date TIMESTAMP,
  start_at TIMESTAMP,
  end_at TIMESTAMP
);

I want to apply the same for Delta Live Tables. Something like this:

CREATE OR REFRESH STREAMING LIVE TABLE dim_store(
  store_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  business_key STRING,
  name STRING,
  email STRING,
  city STRING,
  address STRING,
  phone_number STRING,
  created_date TIMESTAMP,
  updated_date TIMESTAMP,
  start_at TIMESTAMP,
  end_at TIMESTAMP
);

However, when I run the Delta Live Tables pipeline, it throws the following error:

Unsupported SQL statement for table 'dim_store': Missing query is not supported.

Can anyone please help me and tell me how to apply a primary key constraint to a Delta Live Table? I know Databricks does not enforce PK/FK relationships; I want the PK/FK constraints for informational purposes only.

Kindly help here.

6 REPLIES

-werners-
Esteemed Contributor III

I don't think that it is possible (yet).

AFAIK you can only have expectations:

https://learn.microsoft.com/en-us/azure/databricks/workflows/delta-live-tables/delta-live-tables-exp....

But DLT is pretty new, so it might get added later on
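For reference, a sketch of what an expectation looks like in the SQL syntax (assuming a hypothetical upstream view named `store_updates`). Note that a DLT table definition requires an `AS SELECT ...` query, which is what the "Missing query" error in the original post points to. Expectations validate rows as they flow through the pipeline; they do not declare an informational PRIMARY KEY:

```sql
-- Sketch: row-level quality checks via DLT expectations (not a real PK constraint).
-- `store_updates` is an assumed upstream source name, not from the original thread.
CREATE OR REFRESH STREAMING LIVE TABLE dim_store (
  CONSTRAINT store_id_not_null EXPECT (store_id IS NOT NULL) ON VIOLATION DROP ROW
)
AS SELECT * FROM STREAM(LIVE.store_updates);
```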

SRK
Contributor III

Thanks for the reply Werners.

Harun
Honored Contributor

I second that. Only expectations are currently available in Delta Live Tables to maintain data quality. We may expect other constraints in future releases.

youssefmrini
Honored Contributor III

Delta tables on Unity Catalog have PK/FK information (not enforced). Since DLT will soon support UC, I guess they will add this feature.
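For context, the informational (unenforced) constraint on a Unity Catalog Delta table outside DLT looks roughly like this (a sketch; the catalog and schema names are hypothetical, and UC requires the key column to be NOT NULL before a PRIMARY KEY can be added):

```sql
-- Sketch: informational PK on a UC-managed Delta table (not enforced on write).
-- `main.gold` is an assumed catalog.schema, not from the original thread.
ALTER TABLE main.gold.dim_store ALTER COLUMN store_id SET NOT NULL;
ALTER TABLE main.gold.dim_store ADD CONSTRAINT dim_store_pk PRIMARY KEY (store_id);
```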

Oliver_Angelil
Valued Contributor II

@SRK  The documentation shows an example of how you can apply a PK constraint as an Expectation in DLT:

https://docs.databricks.com/en/delta-live-tables/expectations.html#perform-advanced-validation-with-...

Oliver_Angelil
Valued Contributor II

@SRK Please see a copy of this answer on stackoverflow here

You can use DLT Expectations to have this check (see my previous answer if you're using SQL and not Python):

import dlt
import pyspark.sql.types as T

@dlt.table(name="table1")
def create_table1():
    schema = T.StructType([
        T.StructField("id", T.IntegerType(), True),
        T.StructField("name", T.StringType(), True),
        T.StructField("age", T.IntegerType(), True)])

    # id = 1 appears twice, so the uniqueness expectation below will flag it
    data = [(1, "Alice", 25),
            (1, "Bob", 30),
            (3, "Charlie", 40)]

    return spark.createDataFrame(data, schema)

@dlt.table(name="table2")
@dlt.expect("unique_pk", "num_entries = 1")  # warns on (does not drop) violating rows
def create_table2():
    # One row per id with its occurrence count; num_entries > 1 means a duplicate key
    df = dlt.read("table1")
    return df.groupBy("id").count().withColumnRenamed("count", "num_entries")
