cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

How to use uuid in SQL merge into statement

Merchiv
New Contributor III

I have a Merge into statement that I use to update existing entries or create new entries in a dimension table based on a natural business key.

When creating new entries I would like to also create a unique uuid for that entry that I can use to crossreference with other tables.

However the merge into statement doesn't allow me to use `uuid()` .

Example code:

%sql WITH newData AS (
  SELECT
    MAC,
    IP,
    Name
  FROM
    bronze.ComputerLogs
) MERGE INTO silver.d_ComputerInfo AS s USING newData ON s.MAC = newData.MAC
WHEN MATCHED THEN
UPDATE
SET
  s.IP = newData.IP,
  s.Name = newData.Name,
  WHEN NOT MATCHED THEN
INSERT
  (
    ComputerInfoId,
    MAC,
    IP,
    Name
  )
VALUES
  (uuid(), MAC, IP, NAME)

This gives the following error

AnalysisException: nondeterministic expressions are only allowed in
Project, Filter, Aggregate, Window, or Generate, but found:
...

Is there a way to bypass this and still use randomly generated ids?

1 ACCEPTED SOLUTION

Accepted Solutions

-werners-
Esteemed Contributor III

A ok, I thought you were looking for that.

In that case:

have you tested adding the UUID in the newData DF beforehand?

Because it could be a limitiation of the merge itself.

View solution in original post

3 REPLIES 3

-werners-
Esteemed Contributor III

you might wanna look into an identity column, which is possible now in delta lake.

https://www.databricks.com/blog/2022/08/08/identity-columns-to-generate-surrogate-keys-are-now-avail...

Merchiv
New Contributor III

We have used tested this, but these are incremental keys that require syncing if we manually insert some extra keys, which makes them not practical.

-werners-
Esteemed Contributor III

A ok, I thought you were looking for that.

In that case:

have you tested adding the UUID in the newData DF beforehand?

Because it could be a limitiation of the merge itself.

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group