column "id" is of type uuid but expression is of type character varying.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
09-26-2023 04:56 AM
Hello,
I'm trying to write to Azure PostgreSQL flexible database from Azure Databricks, using PostgreSQL connector in Databricks Runtime in 12.2LTS.
I'm using df.write.format("postgresql").save() to write to PostgreSQL database, but getting the following error:
column "id" is of type uuid but expression is of type character varying.
df has an id column that contains a guid, but is of string type in dataframe and of type uuid in PG database.
As there is no guid or uuid type in Databricks, whet is the best way to insert into an uuid type column?
Thanks,
Kalyani
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
10-02-2023 04:02 PM
Hi @kp12 The error you are encountering is because the PostgreSQL UUID data type is not directly compatible with the string type in Spark DataFrame. You can try casting the string to binary type before writing to the database. Here is how you can do it:
python
from pyspark.sql.functions import expr
df = df.withColumn("id", expr("uuid(id)"))
df.write.format("jdbc").option("url", "jdbc:postgresql://<hostname>:<port>/<database>").option("dbtable", "<table>").option("user", "<username>").option("password", "<password>").save()
In the above code, replace <hostname>, <port>, <database>, <table>, <username>, and <password> with your actual PostgreSQL database details. Please note that you need to use "jdbc" as the format in the write method, not "postgresql". The "uuid()" function in the expr method is used to convert the string type to binary type which is compatible with the UUID type in PostgreSQL.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
03-25-2024 06:42 AM
This is completely wrong. Answer looks like chat gpt generated. Chat gpt always provide answers with jdbc connector type. Author is asking about PostgreSQL connector which is better than jdbc.
uuid(id) function don't accept arguments. You can do uuid() and it will generate random guid which will be of type string in dataframe. You provided function is just what chat gpt made up.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
03-25-2024 06:37 AM
Had same problem. You need to add option "stringtype": "unspecified".
Example:
df.write.format("postgresql").mode("overwrite").option("truncate", "true").option("stringtype", "unspecified").option("dbtable", table).option("host", host).option("database", database).option("user", username).option("password", password).save()
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
04-15-2024 07:41 PM - edited 04-15-2024 07:43 PM
Yes, this stack overflow was my reference too and adding below option made load go with no error on UUID data type in postgres column

