cancel
Showing results forย 
Search instead forย 
Did you mean:ย 
Data Engineering
cancel
Showing results forย 
Search instead forย 
Did you mean:ย 

Is Delta table with auto-increment column as unique identifier for delta table is supported? If, yes, how to create that. I am not using Databrics version of Delta.

Delta
New Contributor II
 
2 REPLIES 2

Kaniz
Community Manager
Community Manager

Hi @Rahul Kumarโ€‹ ,

Databricks have IDENTITY columns as described here.

CREATE TABLE [USING] (Databricks SQL)

It defines a managed or external table, optionally using a data source.

{ { [CREATE OR] REPLACE TABLE | CREATE TABLE [ IF NOT EXISTS ] }
  table_name
  [ column_specification ] [ USING data_source ]
  [ table_clauses ]
  [ AS query ] }
 
column_specification
  ( { column_identifier column_type [ NOT NULL ]
      [ GENERATED ALWAYS AS ( expr ) |
        GENERATED { ALWAYS | BY DEFAULT } AS IDENTITY [ ( [ START WITH start ] [ INCREMENT BY step ] ) ] ]
      [ COMMENT column_comment ] } [, ...] )
 
table_clauses
  { OPTIONS clause |
    PARTITIONED BY clause |
    clustered_by_clause |
    LOCATION path [ WITH ( CREDENTIAL credential_name ) ] |
    COMMENT table_comment |
    TBLPROPERTIES clause } [...]
 
clustered_by_clause
  { CLUSTERED BY ( cluster_column [, ...] )
    [ SORTED BY ( { sort_column [ ASC | DESC ] } [, ...] ) ]
    INTO num_buckets BUCKETS }
GENERATED { ALWAYS | BY DEFAULT } AS IDENTITY [ ( [ START WITH start ] [ INCREMENT BY step ] ) ]

  • This parameter defines an identity column. When you write to the table and do not provide values for the identity column, it automatically assigns a unique and statistically increasing (or decreasing if the step is negative) value. This clause is only supported for Delta Lake tables. This clause can only be used for columns with BIGINT data type.
  • The automatically assigned values start with start and increment by step. Assigned values are unique but are not guaranteed to be contiguous. Both parameters are optional, and the default value is 1. step cannot be 0.
  • If the automatically assigned values are beyond the range of the identity column type, the query will fail. When ALWAYS is used, you cannot provide your values for the identity column.
  • The following operations are not supported:
    • PARTITIONED BY an identity column
    • UPDATE an identity column

Example:

create table gen1 (
 
     id long GENERATED ALWAYS AS IDENTITY
 
   , t string
 
)

It requires Runtime version 10.4 or above.

Anonymous
Not applicable

Hey @Rahul Kumarโ€‹ 

Hope everything is going great.

Just checking in. Does @Kaniz Fatmaโ€‹'s response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly? Else please let us know if you need more help. 

Cheers!

Welcome to Databricks Community: Lets learn, network and celebrate together

Join our fast-growing data practitioner and expert community of 80K+ members, ready to discover, help and collaborate together while making meaningful connections. 

Click here to register and join today! 

Engage in exciting technical discussions, join a group with your peers and meet our Featured Members.