Databricks Community

Merchiv · ‎11-30-2022

I have a Merge into statement that I use to update existing entries or create new entries in a dimension table based on a natural business key.

When creating new entries I would like to also create a unique uuid for that entry that I can use to crossreference with other tables.

However the merge into statement doesn't allow me to use `uuid()` .

Example code:

%sql WITH newData AS (
  SELECT
    MAC,
    IP,
    Name
  FROM
    bronze.ComputerLogs
) MERGE INTO silver.d_ComputerInfo AS s USING newData ON s.MAC = newData.MAC
WHEN MATCHED THEN
UPDATE
SET
  s.IP = newData.IP,
  s.Name = newData.Name,
  WHEN NOT MATCHED THEN
INSERT
  (
    ComputerInfoId,
    MAC,
    IP,
    Name
  )
VALUES
  (uuid(), MAC, IP, NAME)

This gives the following error

AnalysisException: nondeterministic expressions are only allowed in
Project, Filter, Aggregate, Window, or Generate, but found:
...

Is there a way to bypass this and still use randomly generated ids?

-werners- · ‎12-01-2022

A ok, I thought you were looking for that.

In that case:

have you tested adding the UUID in the newData DF beforehand?

Because it could be a limitiation of the merge itself.

View solution in original post

-werners- · ‎11-30-2022

you might wanna look into an identity column, which is possible now in delta lake.

https://www.databricks.com/blog/2022/08/08/identity-columns-to-generate-surrogate-keys-are-now-avail...

Merchiv · ‎12-01-2022

We have used tested this, but these are incremental keys that require syncing if we manually insert some extra keys, which makes them not practical.

-werners- · ‎12-01-2022

A ok, I thought you were looking for that.

In that case:

have you tested adding the UUID in the newData DF beforehand?

Because it could be a limitiation of the merge itself.

Databricks Community

How to use uuid in SQL merge into statement

Photos

Join Us as a Local Community Builder!

Announcing the APJ Databricks Smart Business Insights Challenge: Empowering Data-Driven Decision Mak

🚀 Monthly Databricks Get Started Days – Accelerate Your Learning Journey! 🚀

Business Intelligence in the Era of AI

Virtual Learning Festival: 9 April - 30 April

Data + AI Summit 2025 — registration now open!