Ah yes, apologies - that was confusing. Databricks SQL has no native uuidv7; its built-in uuid() function only generates v4. To get v7-compliant identifiers in Databricks (without relying on Neon/Postgres), you can create a SQL UDF that assembles one from the current millisecond timestamp plus random bits.
CREATE OR REPLACE FUNCTION generate_uuidv7()
RETURNS STRING
LANGUAGE SQL
RETURN printf(
  '%08x-%04x-7%03x-%x%03x-%012x',
  -- 48-bit millisecond timestamp, split across the first two groups (8 + 4 hex digits)
  unix_millis(current_timestamp()) div 65536,  -- top 32 bits
  unix_millis(current_timestamp()) % 65536,    -- low 16 bits
  -- version nibble '7' followed by 12 random bits
  CAST(rand() * 4096 AS INT),
  -- variant nibble (8, 9, a, or b) followed by 12 random bits
  8 + CAST(rand() * 4 AS INT),
  CAST(rand() * 4096 AS INT),
  -- final 48 random bits
  CAST(rand() * 281474976710656 AS BIGINT)
);
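If you want to sanity-check the layout locally before deploying the UDF, here is a small Python sketch of the same construction (the function name is mine; this mirrors the standard uuidv7 8-4-4-4-12 layout: 48-bit ms timestamp, version nibble 7, variant nibble 8-b, 74 random bits):

```python
import time
import random

def uuidv7() -> str:
    # Hypothetical local helper, not part of Databricks: builds a
    # uuidv7-shaped string the same way the SQL UDF does.
    ms = int(time.time() * 1000) & 0xFFFFFFFFFFFF  # 48-bit ms timestamp
    rand_a = random.getrandbits(12)                # 12 bits after version
    variant = 8 + random.getrandbits(2)            # 8, 9, a, or b
    rand_b = random.getrandbits(12)                # 12 bits after variant
    tail = random.getrandbits(48)                  # final 48 random bits
    return (
        f"{ms >> 16:08x}-{ms & 0xFFFF:04x}"
        f"-7{rand_a:03x}-{variant:x}{rand_b:03x}-{tail:012x}"
    )

u = uuidv7()
print(u)
assert u[14] == "7"      # version nibble
assert u[19] in "89ab"   # variant nibble
```

Because the timestamp occupies the most significant bits, ids generated in different milliseconds sort lexicographically in time order, which is the whole point of v7 over v4.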
For partitioning - I will mention that partitioning directly on a high-cardinality ID would create the "small file problem". Instead, partition by the date encoded in the uuidv7's time component, as below. Alternatively, if you are on Delta Lake, consider Liquid Clustering (CLUSTER BY), which handles high-cardinality keys much better than Hive-style partitioning.
CREATE TABLE events (
  id STRING,
  event_date DATE GENERATED ALWAYS AS (
    -- the first 48 bits of the id are the millisecond timestamp;
    -- concat skips the dash between the first two hex groups
    CAST(
      timestamp_millis(
        CAST(conv(concat(substring(id, 1, 8), substring(id, 10, 4)), 16, 10) AS BIGINT)
      ) AS DATE
    )
  )
)
USING DELTA
PARTITIONED BY (event_date);
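The extraction the generated column performs can be checked locally with the same Python sketch style (hypothetical helper; the example id is a hardcoded illustration whose random parts are arbitrary):

```python
from datetime import datetime, timezone

def uuidv7_date(uid: str) -> str:
    # Same logic as the generated column: the first 48 bits are the
    # Unix timestamp in milliseconds; skip the dash at position 9.
    ms = int(uid[0:8] + uid[9:13], 16)
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc).date().isoformat()

# 0x018f4e2b1c00 ms since the epoch falls on 2024-05-06 (UTC)
print(uuidv7_date("018f4e2b-1c00-7abc-9def-0123456789ab"))  # → 2024-05-06
```

Note the date is derived in UTC here; the SQL generated column resolves the cast using the session time zone, so set it accordingly if the two must agree.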