
column "id" is of type uuid but expression is of type character varying.

kp12
New Contributor II

Hello,

I'm trying to write to an Azure PostgreSQL Flexible Server database from Azure Databricks, using the PostgreSQL connector in Databricks Runtime 12.2 LTS.

I'm using df.write.format("postgresql").save() to write to the PostgreSQL database, but I'm getting the following error:

column "id" is of type uuid but expression is of type character varying.

df has an id column that contains a GUID, but it is of string type in the DataFrame and of type uuid in the PG database.
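For illustration, this is roughly what the mismatch looks like (the values and the table name below are just placeholders):

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Placeholder data: the id column holds GUID strings in Spark.
df = spark.createDataFrame(
    [("3f2504e0-4f89-11d3-9a0c-0305e82c3301", "example")],
    ["id", "name"],
)
df.printSchema()
# |-- id: string (nullable = true)     <- string in Spark
# |-- name: string (nullable = true)

# Target table in PostgreSQL (placeholder name):
# CREATE TABLE my_table (id uuid PRIMARY KEY, name text);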

As there is no guid or uuid type in Databricks, what is the best way to insert into a uuid-typed column?

Thanks,

Kalyani

4 REPLIES

Priyanka_Biswas
Valued Contributor

Hi @kp12, the error you are encountering is because the PostgreSQL UUID data type is not directly compatible with the string type in a Spark DataFrame. You can try casting the string to binary type before writing to the database. Here is how you can do it:

from pyspark.sql.functions import expr

df = df.withColumn("id", expr("uuid(id)"))

df.write.format("jdbc") \
    .option("url", "jdbc:postgresql://<hostname>:<port>/<database>") \
    .option("dbtable", "<table>") \
    .option("user", "<username>") \
    .option("password", "<password>") \
    .save()

In the above code, replace <hostname>, <port>, <database>, <table>, <username>, and <password> with your actual PostgreSQL database details. Please note that you need to use "jdbc" as the format in the write method, not "postgresql". The "uuid()" function in the expr method is used to convert the string type to binary type which is compatible with the UUID type in PostgreSQL.

This is completely wrong. The answer looks like it was generated by ChatGPT. ChatGPT always gives answers that use the JDBC connector, but the author is asking about the PostgreSQL connector, which is better than JDBC.

The uuid() function doesn't accept arguments. You can call uuid() and it will generate a random GUID, which will be of string type in the DataFrame. The function you provided is just something ChatGPT made up.
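For reference, a quick sketch of what the built-in uuid() actually does (the column name here is just a placeholder):

from pyspark.sql.functions import expr

# Spark SQL's uuid() takes no arguments; it returns a new random UUID
# per row, as a *string* column, not a native uuid type.
demo = spark.range(3).withColumn("generated_id", expr("uuid()"))
demo.printSchema()          # generated_id: string
demo.show(truncate=False)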

noname123
New Contributor III

Had the same problem. You need to add the option "stringtype": "unspecified".

Example:

df.write.format("postgresql") \
    .mode("overwrite") \
    .option("truncate", "true") \
    .option("stringtype", "unspecified") \
    .option("dbtable", table) \
    .option("host", host) \
    .option("database", database) \
    .option("user", username) \
    .option("password", password) \
    .save()
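If you are on the plain JDBC source instead of the PostgreSQL connector, the same setting can, as far as I know, be passed as a driver property in the connection URL; the hostname, database, and table below are placeholders:

# Sketch only: stringtype=unspecified tells the PostgreSQL JDBC driver to send
# strings as untyped parameters, so the server casts them to uuid on insert.
df.write.format("jdbc") \
    .option("url", "jdbc:postgresql://<hostname>:5432/<database>?stringtype=unspecified") \
    .option("dbtable", "<table>") \
    .option("user", "<username>") \
    .option("password", "<password>") \
    .mode("append") \
    .save()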

Student-Learn
New Contributor II

Yes, this Stack Overflow post was my reference too, and adding the option below made the load go through with no error on the uuid data type in the Postgres column:

.option("stringtype", "unspecified")

https://stackoverflow.com/questions/40973909/how-to-write-into-postgresql-hstore-using-spark-dataset...
