topic Databricks UUID in Warehousing & Analytics

Databricks UUID

David_Dabbs — Mon, 09 Feb 2026 23:27:59 GMT

Is there a plan for Databricks to support the UUIDv7 variant that is friendlier to use in databases for keys and partitioning?
https://python.plainenglish.io/python-3-14-brings-uuidv6-v7-v8-stop-using-uuid4-like-its-2015-90518fdfce81

Also, related, Python 3.1.4 is the first release that natively supports the v7 UUID variant.
Would it be possible to say when one could expect Python 3.1.4 to land in a LTS release?

Thank you.

Re: Databricks UUID

sarahbhord — Thu, 12 Feb 2026 16:19:57 GMT

Hey @David_Dabbs - We do support uuidv7 database keys and partitioning in Lakebase PostgreSQL via pg_uuidv7 and Databricks SQL functions. There are some limitations when it comes to clustering and column generation.

As for Python 3.1.4 - this is an outdated minor release, and modern Databricks runtimes target versions 3.10-3.12 range. Are you inquiring about "3.11.4"? If so, Python 3.11.4 will not land in a Databricks Runtime LTS. Instead, 15.4 LTS already standardizes on Python 3.11.11, and newer LTS lines use Python 3.12.3.

I hope this helps!

Sarah

Re: Databricks UUID

David_Dabbs — Thu, 12 Feb 2026 16:50:55 GMT

Thank you @sarahbhord.

We do support uuidv7 database keys and partitioning in Lakebase PostgreSQL via pg_uuidv7 and Databricks SQL functions. There are some limitations when it comes to clustering and column generation.

I'm not following the Neon Databricks SQL connection. We would want to use uuidv7 in Databricks, not Neon.

As for Python 3.1.4 - this is an outdated minor release, and modern Databricks runtimes target versions 3.10-3.12 range. Are you inquiring about "3.11.4"? If so, Python 3.11.4 will not land in a Databricks Runtime LTS. Instead, 15.4 LTS already standardizes on Python 3.11.11, and newer LTS lines use Python 3.12.3.

Yes, 3.1.4 was a typo on my part. UUIDv7 support landed in Python with version 3.14.

Re: Databricks UUID

sarahbhord — Thu, 12 Feb 2026 19:26:29 GMT

Ah yes apologies - that was confusing. To implement uuidv7 in DATABRICKS (using Databricks SQL) (without relying on Neon/Postgres), you can leverage Databricks' native support for the uuid() function (v4) and standard SQL to construct a v7-compliant identifier... AKA you can create a SQL UDF to generate them.

CREATE OR REPLACE FUNCTION generate_uuidv7() RETURNS STRING LANGUAGE SQL AS SELECT printf('%012x-%s-%s-%s-%s', -- 48-bit timestamp in milliseconds CAST(unix_millis(current_timestamp()) AS LONG), -- Version 7 and first 12 random bits (hex starts with '7') substring(hex(random()), 1, 4), -- Variant 1 and next 12 random bits (hex starts with 8, 9, A, or B) substring(hex(random()), 5, 4), substring(hex(random()), 9, 4), substring(hex(random()), 13, 12) );

For partitioning - I will mention that using a high-cardinality ID could create the "small file problem". Instead, parition by the date of the time component of the uuidv7. Alternatively, if you are using Delta Lake, use Liquid Clustering as it handles high-cardinality keys much better.

CREATE TABLE events ( id STRING, event_date DATE GENERATED ALWAYS AS ( CAST(from_unixtime(conv(substring(id, 1, 12), 16, 10) / 1000) AS DATE) ) ) PARTITIONED BY (event_date);