Data Engineering

Defining Keys

tyas
New Contributor II

Hello,

I have a DataFrame in a Databricks notebook that I've already read and transformed using PySpark. I want to create a table with defined keys (primary and foreign). Which is the better way to do this?

  • Create the table and define the keys directly
  • Use saveAsTable (Delta format) and then ALTER TABLE

Thanks, Tyas

1 REPLY

Hubert-Dudek
Esteemed Contributor III

Remember that keys are informational only: Databricks does not enforce them, so they don't validate data integrity. They are used as metadata in a few places (feature tables, online tables, Power BI modelling). The best approach is to define them in the CREATE TABLE statement, for example:

CREATE TABLE IF NOT EXISTS products (
    product_id INT NOT NULL,
    -- Informational only; named products_pk so the constraint does not share the column's name
    CONSTRAINT products_pk PRIMARY KEY (product_id)
)

More details are in the docs: https://docs.databricks.com/en/sql/language-manual/sql-ref-syntax-ddl-create-table-constraint.html
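
To tie this back to the DataFrame in the question, here is a rough sketch of both options from a PySpark notebook. The helper names (create_table_ddl, alter_table_statements, create_then_append, save_then_alter) are made up for illustration and are not part of any Databricks API; the sketch also assumes the table lives in Unity Catalog, which as far as I know is required for key constraints, and that `spark` and `df` already exist in the notebook.

```python
def _short_name(table):
    """Last part of a possibly qualified name, e.g. main.sales.products -> products."""
    return table.split(".")[-1]


def create_table_ddl(table, columns, primary_key, foreign_keys=None):
    """Option 1: build a CREATE TABLE statement that declares the keys up front.

    columns:      dict of column name -> SQL type
    primary_key:  list of column names (they are marked NOT NULL)
    foreign_keys: optional dict of column -> (parent_table, parent_column)
    """
    defs = []
    for name, sql_type in columns.items():
        suffix = " NOT NULL" if name in primary_key else ""
        defs.append(f"{name} {sql_type}{suffix}")
    short = _short_name(table)
    defs.append(f"CONSTRAINT {short}_pk PRIMARY KEY ({', '.join(primary_key)})")
    for col, (parent, parent_col) in (foreign_keys or {}).items():
        defs.append(
            f"CONSTRAINT {short}_{col}_fk FOREIGN KEY ({col}) "
            f"REFERENCES {parent} ({parent_col})"
        )
    return f"CREATE TABLE IF NOT EXISTS {table} (\n    " + ",\n    ".join(defs) + "\n)"


def alter_table_statements(table, primary_key):
    """Option 2: statements to run after saveAsTable to add the primary key."""
    stmts = [
        f"ALTER TABLE {table} ALTER COLUMN {col} SET NOT NULL" for col in primary_key
    ]
    stmts.append(
        f"ALTER TABLE {table} ADD CONSTRAINT {_short_name(table)}_pk "
        f"PRIMARY KEY ({', '.join(primary_key)})"
    )
    return stmts


def create_then_append(spark, df, table, ddl):
    """Option 1 end to end: create the keyed table, then append the DataFrame."""
    spark.sql(ddl)
    df.write.mode("append").saveAsTable(table)


def save_then_alter(spark, df, table, statements):
    """Option 2 end to end: write the Delta table first, then add the keys."""
    df.write.format("delta").mode("overwrite").saveAsTable(table)
    for stmt in statements:
        spark.sql(stmt)


if __name__ == "__main__":
    # Only prints the generated SQL; no Spark session is needed for this part.
    print(create_table_ddl("products", {"product_id": "INT", "name": "STRING"}, ["product_id"]))
```

In a notebook you would then call something like create_then_append(spark, df, "products", ddl). Either route should end with the same informational constraints; defining them in CREATE TABLE just keeps the schema and the keys in one statement.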