<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic How to apply Primary Key constraint in Delta Live Table? in Machine Learning</title>
    <link>https://community.databricks.com/t5/machine-learning/how-to-apply-primary-key-constraint-in-delta-live-table/m-p/17843#M982</link>
    <description>&lt;P&gt;In &lt;A href="https://www.databricks.com/blog/2022/10/20/data-modeling-best-practices-implementation-modern-lakehouse.html?_ga=2.25467645.1734247389.1670501677-426402172.1667377962" alt="https://www.databricks.com/blog/2022/10/20/data-modeling-best-practices-implementation-modern-lakehouse.html?_ga=2.25467645.1734247389.1670501677-426402172.1667377962" target="_blank"&gt;this blog post&lt;/A&gt;, a primary key constraint is applied to the dimension and fact tables. Here is the example:&lt;/P&gt;&lt;P&gt;&lt;I&gt;-- Store dimension&lt;/I&gt;&lt;/P&gt;&lt;P&gt;CREATE OR REPLACE TABLE dim_store(&lt;/P&gt;&lt;P&gt;  store_id BIGINT GENERATED ALWAYS AS IDENTITY &lt;B&gt;&lt;U&gt;PRIMARY KEY&lt;/U&gt;&lt;/B&gt;,&lt;/P&gt;&lt;P&gt;  business_key STRING,&lt;/P&gt;&lt;P&gt;  name STRING,&lt;/P&gt;&lt;P&gt;  email STRING,&lt;/P&gt;&lt;P&gt;  city STRING,&lt;/P&gt;&lt;P&gt;  address STRING,&lt;/P&gt;&lt;P&gt;  phone_number STRING,&lt;/P&gt;&lt;P&gt;  created_date TIMESTAMP,&lt;/P&gt;&lt;P&gt;  updated_date TIMESTAMP,&lt;/P&gt;&lt;P&gt;  start_at TIMESTAMP,&lt;/P&gt;&lt;P&gt;  end_at TIMESTAMP&lt;/P&gt;&lt;P&gt;);&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I want to do the same in &lt;B&gt;Delta Live Tables&lt;/B&gt;, something like this:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;CREATE OR REFRESH STREAMING LIVE TABLE dim_store(&lt;/P&gt;&lt;P&gt;  store_id BIGINT GENERATED ALWAYS AS IDENTITY &lt;B&gt;&lt;U&gt;PRIMARY KEY&lt;/U&gt;&lt;/B&gt;,&lt;/P&gt;&lt;P&gt;  business_key STRING,&lt;/P&gt;&lt;P&gt;  name STRING,&lt;/P&gt;&lt;P&gt;  email STRING,&lt;/P&gt;&lt;P&gt;  city STRING,&lt;/P&gt;&lt;P&gt;  address STRING,&lt;/P&gt;&lt;P&gt;  phone_number STRING,&lt;/P&gt;&lt;P&gt;  created_date TIMESTAMP,&lt;/P&gt;&lt;P&gt;  updated_date TIMESTAMP,&lt;/P&gt;&lt;P&gt;  start_at TIMESTAMP,&lt;/P&gt;&lt;P&gt;  end_at TIMESTAMP&lt;/P&gt;&lt;P&gt;);&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;However, when I run the Delta Live Tables pipeline,
it throws the following error:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;B&gt;Unsupported SQL statement for table 'dim_store': Missing query is not supported.&lt;/B&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Can anyone tell me how to apply a primary key constraint to a Delta Live Table? I know that &lt;B&gt;Databricks does not enforce PK/FK relationships.&lt;/B&gt; I only want the &lt;B&gt;PK/FK&lt;/B&gt; constraints for informational purposes.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;B&gt;Kindly help here.&lt;/B&gt;&lt;/P&gt;</description>
    <pubDate>Thu, 08 Dec 2022 14:29:20 GMT</pubDate>
    <dc:creator>SRK</dc:creator>
    <dc:date>2022-12-08T14:29:20Z</dc:date>
    <item>
      <title>How to apply Primary Key constraint in Delta Live Table?</title>
      <link>https://community.databricks.com/t5/machine-learning/how-to-apply-primary-key-constraint-in-delta-live-table/m-p/17843#M982</link>
      <description>&lt;P&gt;In &lt;A href="https://www.databricks.com/blog/2022/10/20/data-modeling-best-practices-implementation-modern-lakehouse.html?_ga=2.25467645.1734247389.1670501677-426402172.1667377962" alt="https://www.databricks.com/blog/2022/10/20/data-modeling-best-practices-implementation-modern-lakehouse.html?_ga=2.25467645.1734247389.1670501677-426402172.1667377962" target="_blank"&gt;this blog post&lt;/A&gt;, a primary key constraint is applied to the dimension and fact tables. Here is the example:&lt;/P&gt;&lt;P&gt;&lt;I&gt;-- Store dimension&lt;/I&gt;&lt;/P&gt;&lt;P&gt;CREATE OR REPLACE TABLE dim_store(&lt;/P&gt;&lt;P&gt;  store_id BIGINT GENERATED ALWAYS AS IDENTITY &lt;B&gt;&lt;U&gt;PRIMARY KEY&lt;/U&gt;&lt;/B&gt;,&lt;/P&gt;&lt;P&gt;  business_key STRING,&lt;/P&gt;&lt;P&gt;  name STRING,&lt;/P&gt;&lt;P&gt;  email STRING,&lt;/P&gt;&lt;P&gt;  city STRING,&lt;/P&gt;&lt;P&gt;  address STRING,&lt;/P&gt;&lt;P&gt;  phone_number STRING,&lt;/P&gt;&lt;P&gt;  created_date TIMESTAMP,&lt;/P&gt;&lt;P&gt;  updated_date TIMESTAMP,&lt;/P&gt;&lt;P&gt;  start_at TIMESTAMP,&lt;/P&gt;&lt;P&gt;  end_at TIMESTAMP&lt;/P&gt;&lt;P&gt;);&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I want to do the same in &lt;B&gt;Delta Live Tables&lt;/B&gt;, something like this:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;CREATE OR REFRESH STREAMING LIVE TABLE dim_store(&lt;/P&gt;&lt;P&gt;  store_id BIGINT GENERATED ALWAYS AS IDENTITY &lt;B&gt;&lt;U&gt;PRIMARY KEY&lt;/U&gt;&lt;/B&gt;,&lt;/P&gt;&lt;P&gt;  business_key STRING,&lt;/P&gt;&lt;P&gt;  name STRING,&lt;/P&gt;&lt;P&gt;  email STRING,&lt;/P&gt;&lt;P&gt;  city STRING,&lt;/P&gt;&lt;P&gt;  address STRING,&lt;/P&gt;&lt;P&gt;  phone_number STRING,&lt;/P&gt;&lt;P&gt;  created_date TIMESTAMP,&lt;/P&gt;&lt;P&gt;  updated_date TIMESTAMP,&lt;/P&gt;&lt;P&gt;  start_at TIMESTAMP,&lt;/P&gt;&lt;P&gt;  end_at TIMESTAMP&lt;/P&gt;&lt;P&gt;);&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;However, when I run the Delta Live Tables pipeline,
it throws the following error:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;B&gt;Unsupported SQL statement for table 'dim_store': Missing query is not supported.&lt;/B&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Can anyone tell me how to apply a primary key constraint to a Delta Live Table? I know that &lt;B&gt;Databricks does not enforce PK/FK relationships.&lt;/B&gt; I only want the &lt;B&gt;PK/FK&lt;/B&gt; constraints for informational purposes.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;B&gt;Kindly help here.&lt;/B&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 08 Dec 2022 14:29:20 GMT</pubDate>
      <guid>https://community.databricks.com/t5/machine-learning/how-to-apply-primary-key-constraint-in-delta-live-table/m-p/17843#M982</guid>
      <dc:creator>SRK</dc:creator>
      <dc:date>2022-12-08T14:29:20Z</dc:date>
    </item>
    <item>
      <title>Re: How to apply Primary Key constraint in Delta Live Table?</title>
      <link>https://community.databricks.com/t5/machine-learning/how-to-apply-primary-key-constraint-in-delta-live-table/m-p/17844#M983</link>
      <description>&lt;P&gt;I don't think that is possible (yet).&lt;/P&gt;&lt;P&gt;AFAIK you can only define expectations:&lt;/P&gt;&lt;P&gt;&lt;A href="https://learn.microsoft.com/en-us/azure/databricks/workflows/delta-live-tables/delta-live-tables-expectations" target="test_blank"&gt;https://learn.microsoft.com/en-us/azure/databricks/workflows/delta-live-tables/delta-live-tables-expectations&lt;/A&gt;.&lt;/P&gt;&lt;P&gt;But DLT is pretty new, so it might get added later on.&lt;/P&gt;</description>
      <pubDate>Thu, 08 Dec 2022 14:54:33 GMT</pubDate>
      <guid>https://community.databricks.com/t5/machine-learning/how-to-apply-primary-key-constraint-in-delta-live-table/m-p/17844#M983</guid>
      <dc:creator>-werners-</dc:creator>
      <dc:date>2022-12-08T14:54:33Z</dc:date>
    </item>
    <item>
      <title>Re: How to apply Primary Key constraint in Delta Live Table?</title>
      <link>https://community.databricks.com/t5/machine-learning/how-to-apply-primary-key-constraint-in-delta-live-table/m-p/17845#M984</link>
      <description>&lt;P&gt;Thanks for the reply Werners.&lt;/P&gt;</description>
      <pubDate>Thu, 08 Dec 2022 15:56:10 GMT</pubDate>
      <guid>https://community.databricks.com/t5/machine-learning/how-to-apply-primary-key-constraint-in-delta-live-table/m-p/17845#M984</guid>
      <dc:creator>SRK</dc:creator>
      <dc:date>2022-12-08T15:56:10Z</dc:date>
    </item>
    <item>
      <title>Re: How to apply Primary Key constraint in Delta Live Table?</title>
      <link>https://community.databricks.com/t5/machine-learning/how-to-apply-primary-key-constraint-in-delta-live-table/m-p/17846#M985</link>
      <description>&lt;P&gt;I agree. Only expectations are currently available in Delta Live Tables to maintain data quality. We may see other constraints in future releases.&lt;/P&gt;</description>
      <pubDate>Thu, 08 Dec 2022 20:59:46 GMT</pubDate>
      <guid>https://community.databricks.com/t5/machine-learning/how-to-apply-primary-key-constraint-in-delta-live-table/m-p/17846#M985</guid>
      <dc:creator>Harun</dc:creator>
      <dc:date>2022-12-08T20:59:46Z</dc:date>
    </item>
    <item>
      <title>Re: How to apply Primary Key constraint in Delta Live Table?</title>
      <link>https://community.databricks.com/t5/machine-learning/how-to-apply-primary-key-constraint-in-delta-live-table/m-p/17847#M986</link>
      <description>&lt;P&gt;Delta tables in Unity Catalog can carry PK/FK information (not enforced). Since DLT will soon support UC, I guess they will add this feature.&lt;/P&gt;</description>
      <pubDate>Fri, 09 Dec 2022 09:41:13 GMT</pubDate>
      <guid>https://community.databricks.com/t5/machine-learning/how-to-apply-primary-key-constraint-in-delta-live-table/m-p/17847#M986</guid>
      <dc:creator>youssefmrini</dc:creator>
      <dc:date>2022-12-09T09:41:13Z</dc:date>
    </item>
    <item>
      <title>Re: How to apply Primary Key constraint in Delta Live Table?</title>
      <link>https://community.databricks.com/t5/machine-learning/how-to-apply-primary-key-constraint-in-delta-live-table/m-p/49794#M2697</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/67846"&gt;@SRK&lt;/a&gt;&amp;nbsp;&amp;nbsp;The documentation shows an example of how you can apply a PK constraint as an Expectation in DLT:&lt;/P&gt;&lt;P&gt;&lt;A href="https://docs.databricks.com/en/delta-live-tables/expectations.html#perform-advanced-validation-with-delta-live-tables-expectations" target="_blank"&gt;https://docs.databricks.com/en/delta-live-tables/expectations.html#perform-advanced-validation-with-delta-live-tables-expectations&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 24 Oct 2023 13:25:03 GMT</pubDate>
      <guid>https://community.databricks.com/t5/machine-learning/how-to-apply-primary-key-constraint-in-delta-live-table/m-p/49794#M2697</guid>
      <dc:creator>Oliver_Angelil</dc:creator>
      <dc:date>2023-10-24T13:25:03Z</dc:date>
    </item>
    <item>
      <title>Re: How to apply Primary Key constraint in Delta Live Table?</title>
      <link>https://community.databricks.com/t5/machine-learning/how-to-apply-primary-key-constraint-in-delta-live-table/m-p/49797#M2698</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/67846"&gt;@SRK&lt;/a&gt;&amp;nbsp;Please see a copy of this answer on Stack Overflow &lt;A href="https://stackoverflow.com/a/77353585/5392289" target="_self"&gt;here&lt;/A&gt;.&lt;/P&gt;&lt;P&gt;You can use DLT Expectations to perform this check (see my previous answer if you're using SQL and not Python):&lt;/P&gt;&lt;PRE&gt;import dlt
from pyspark.sql import types as T

# `spark` is the session that the DLT notebook provides.

@dlt.table(name="table1")
def table1():
    # Sample data with a deliberately duplicated id (1).
    schema = T.StructType([
        T.StructField("id", T.IntegerType(), True),
        T.StructField("name", T.StringType(), True),
        T.StructField("age", T.IntegerType(), True),
    ])
    data = [(1, "Alice", 25),
            (1, "Bob", 30),
            (3, "Charlie", 40)]
    return spark.createDataFrame(data, schema)

@dlt.table(name="table2")
@dlt.expect("unique_pk", "num_entries = 1")
def table2():
    # One row per id with its row count; a count other than 1 violates the expectation.
    df = dlt.read("table1")
    return df.groupBy("id").count().withColumnRenamed("count", "num_entries")&lt;/PRE&gt;</description>
      <pubDate>Tue, 24 Oct 2023 15:52:14 GMT</pubDate>
      <guid>https://community.databricks.com/t5/machine-learning/how-to-apply-primary-key-constraint-in-delta-live-table/m-p/49797#M2698</guid>
      <dc:creator>Oliver_Angelil</dc:creator>
      <dc:date>2023-10-24T15:52:14Z</dc:date>
    </item>
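The expectation in the last reply passes only when every id appears exactly once. As a rough illustration of that logic outside a pipeline, here is a minimal sketch in plain Python with no Spark or DLT dependency. It assumes rows are plain dicts, and `find_duplicate_keys` is a hypothetical helper name, not part of the `dlt` API.

```python
from collections import Counter


def find_duplicate_keys(rows, key="id"):
    """Return {key_value: count} for every key value that does not appear exactly once."""
    # Count rows per key value, mirroring groupBy("id").count() in the reply.
    counts = Counter(row[key] for row in rows)
    # Keep only the values that would violate "num_entries = 1".
    return {value: n for value, n in counts.items() if n != 1}


# Same sample data as in the reply: id 1 is duplicated on purpose.
rows = [
    {"id": 1, "name": "Alice", "age": 25},
    {"id": 1, "name": "Bob", "age": 30},
    {"id": 3, "name": "Charlie", "age": 40},
]

duplicates = find_duplicate_keys(rows)
print(duplicates)  # prints {1: 2}
```

An empty result means the column behaves like a primary key for the given rows. As with the expectation, this only reports duplicates and does not prevent them from being written.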
  </channel>
</rss>

