Ah yes, apologies - that was confusing. Databricks SQL has no native uuidv7; its built-in uuid() function only generates v4. To get v7-compliant identifiers in Databricks (without relying on Neon/Postgres), you can create a SQL UDF that assembles one from the current millisecond timestamp plus random bits.
CREATE OR REPLACE FUNCTION generate_uuidv7()
RETURNS STRING
LANGUAGE SQL
RETURN printf(
  '%08x-%04x-7%03x-%x%03x-%012x',
  -- 48-bit millisecond timestamp, split across the first two groups (8 + 4 hex digits)
  unix_millis(current_timestamp()) div 65536,  -- top 32 bits
  unix_millis(current_timestamp()) % 65536,    -- low 16 bits
  -- version nibble '7' followed by 12 random bits
  CAST(rand() * 4096 AS INT),
  -- variant nibble (8, 9, a, or b) followed by 12 random bits
  8 + CAST(rand() * 4 AS INT),
  CAST(rand() * 4096 AS INT),
  -- final 48 random bits
  CAST(rand() * 281474976710656 AS BIGINT)
);
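If you want to sanity-check the layout locally before deploying the UDF, here is a small Python sketch of the same construction (the function name is mine; this mirrors the standard uuidv7 8-4-4-4-12 layout: 48-bit ms timestamp, version nibble 7, variant nibble 8-b, 74 random bits):

```python
import time
import random

def uuidv7() -> str:
    # Hypothetical local helper, not part of Databricks: builds a
    # uuidv7-shaped string the same way the SQL UDF does.
    ms = int(time.time() * 1000) & 0xFFFFFFFFFFFF  # 48-bit ms timestamp
    rand_a = random.getrandbits(12)                # 12 bits after version
    variant = 8 + random.getrandbits(2)            # 8, 9, a, or b
    rand_b = random.getrandbits(12)                # 12 bits after variant
    tail = random.getrandbits(48)                  # final 48 random bits
    return (
        f"{ms >> 16:08x}-{ms & 0xFFFF:04x}"
        f"-7{rand_a:03x}-{variant:x}{rand_b:03x}-{tail:012x}"
    )

u = uuidv7()
print(u)
assert u[14] == "7"      # version nibble
assert u[19] in "89ab"   # variant nibble
```

Because the timestamp occupies the most significant bits, ids generated in different milliseconds sort lexicographically in time order, which is the whole point of v7 over v4.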
For partitioning - I will mention that partitioning directly on a high-cardinality ID would create the "small file problem". Instead, partition by the date encoded in the uuidv7's time component, as below. Alternatively, if you are on Delta Lake, consider Liquid Clustering (CLUSTER BY), which handles high-cardinality keys much better than Hive-style partitioning.
CREATE TABLE events (
  id STRING,
  event_date DATE GENERATED ALWAYS AS (
    -- the first 48 bits of the id are the millisecond timestamp;
    -- concat skips the dash between the first two hex groups
    CAST(
      timestamp_millis(
        CAST(conv(concat(substring(id, 1, 8), substring(id, 10, 4)), 16, 10) AS BIGINT)
      ) AS DATE
    )
  )
)
USING DELTA
PARTITIONED BY (event_date);
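The extraction the generated column performs can be checked locally with the same Python sketch style (hypothetical helper; the example id is a hardcoded illustration whose random parts are arbitrary):

```python
from datetime import datetime, timezone

def uuidv7_date(uid: str) -> str:
    # Same logic as the generated column: the first 48 bits are the
    # Unix timestamp in milliseconds; skip the dash at position 9.
    ms = int(uid[0:8] + uid[9:13], 16)
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc).date().isoformat()

# 0x018f4e2b1c00 ms since the epoch falls on 2024-05-06 (UTC)
print(uuidv7_date("018f4e2b-1c00-7abc-9def-0123456789ab"))  # → 2024-05-06
```

Note the date is derived in UTC here; the SQL generated column resolves the cast using the session time zone, so set it accordingly if the two must agree.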