09-10-2025 06:20 AM
The docs for the AUTO CDC API state:
You must specify a column in the source data on which to sequence records, which Lakeflow Declarative Pipelines interprets as a monotonically increasing representation of the proper ordering of the source data.
Can this be something other than an integer, like a timestamp or UUID v7?
09-10-2025 06:37 AM
OK, later on the docs show a timestamp example (https://learn.microsoft.com/en-us/azure/databricks/dlt/cdc#use-multiple-columns-for-sequencing), but I'm still curious about UUID v7.
09-10-2025 06:53 AM
Hi @Rjdudley ,
UUID v7 should also work, because its values are generally monotonically increasing: the leading 48 bits are a Unix timestamp in milliseconds.
For example, this is an excerpt from the PostgreSQL implementation:
* Generate UUID version 7 per RFC 9562, with the given timestamp.
*
* UUID version 7 consists of a Unix timestamp in milliseconds (48
* bits) and 74 random bits, excluding the required version and
* variant bits. To ensure monotonicity in scenarios of high-
* frequency UUID generation, we employ the method "Replace
* Leftmost Random Bits with Increased Clock Precision (Method 3)",
* described in the RFC. […]
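To make the layout in that comment concrete, here is a minimal Python sketch, not PostgreSQL's or Databricks' code. The helpers `make_uuid7` and `timestamp_ms` are hypothetical names, and the generator is deliberately simplified: it does not implement Method 3, so two values created in the same millisecond may sort in either order. It also assumes the UUIDs are stored as canonical lowercase strings and compared with ordinary string ordering; whether AUTO CDC accepts such a column for sequencing is not confirmed here.

```python
import os
import uuid


def make_uuid7(unix_ms: int) -> uuid.UUID:
    """Build a UUID v7 from a millisecond timestamp (simplified, no Method 3)."""
    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits, trimmed below
    rand_a = (rand >> 62) & 0xFFF                 # 12 random bits
    rand_b = rand & ((1 << 62) - 1)               # 62 random bits
    value = (unix_ms & ((1 << 48) - 1)) << 80     # 48-bit timestamp in the top bits
    value |= 0x7 << 76                            # version field = 7
    value |= rand_a << 64
    value |= 0b10 << 62                           # RFC variant bits
    value |= rand_b
    return uuid.UUID(int=value)


def timestamp_ms(u: uuid.UUID) -> int:
    """Recover the embedded Unix timestamp in milliseconds."""
    return u.int >> 80


# Sample timestamps (assumed values), strictly increasing.
stamps = [1_757_485_200_000, 1_757_485_200_001, 1_757_485_260_000]
ids = [make_uuid7(ms) for ms in stamps]
as_strings = [str(u) for u in ids]

# Because the timestamp occupies the leading hex digits, sorting the
# canonical strings reproduces the generation order for distinct milliseconds.
print(sorted(as_strings) == as_strings)
```

The ordering guarantee in this sketch only holds across distinct milliseconds. For events that can share a millisecond, ordering depends on the generator providing extra precision or a counter, as the PostgreSQL comment describes.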
10-22-2025 11:49 AM
Thanks Szymon, I'm familiar with the PostgreSQL implementation and was hoping Databricks would behave the same.