How to maintain Primary Key Column in Databricks Delta Multi Cluster environment
Options
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
08-25-2019 04:47 AM
I am trying to replicate the SQL DB like feature of maintaining the Primary Keys in Databrciks Delta approach where the data is being written to Blob Storage such as ADLS2 oe AWS S3.
I want a Auto Incremented Primary key feature using Databricks Delta.
Existing approach - is using the latest row count and maintaining the Primary keys. However, this approach does not suit in parallel processing environment where Primary keys get duplicated data.
Labels: