<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic how read through jdbc from postgres to databricks a particular data type in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/how-read-through-jdbc-from-postgres-to-databricks-a-particular/m-p/109455#M43324</link>
    <description>&lt;P&gt;&lt;STRONG&gt;Hi Community,&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;I need to load data from PostgreSQL into Databricks through JDBC without changing the data type of a VARCHAR[]column in PostgreSQL, which should remain as an array of strings in Databricks.&lt;/P&gt;&lt;P&gt;Previously, I used psycopg2, and it worked, but I had to first convert the data in Pandas and then in Spark before using it in Databricks. I’d like to switch to JDBC for a more direct approach.&lt;/P&gt;&lt;P&gt;However, when I load the data, the VARCHAR[] column appears as a string in the format:&lt;BR /&gt;"{LIKE, COME, "LIKE_COME"}&lt;BR /&gt;Similarly, arrays of BIGINT appear as:&lt;BR /&gt;"{45,46}".&lt;/P&gt;&lt;P&gt;I understand that PostgreSQL array types are not natively supported in Databricks, but what steps should I take to retain the same data type after loading via JDBC?&lt;/P&gt;&lt;P&gt;Any guidance would be appreciated!&lt;/P&gt;</description>
    <pubDate>Fri, 07 Feb 2025 22:59:03 GMT</pubDate>
    <dc:creator>jeremy98</dc:creator>
    <dc:date>2025-02-07T22:59:03Z</dc:date>
    <item>
      <title>how read through jdbc from postgres to databricks a particular data type</title>
      <link>https://community.databricks.com/t5/data-engineering/how-read-through-jdbc-from-postgres-to-databricks-a-particular/m-p/109455#M43324</link>
      <description>&lt;P&gt;&lt;STRONG&gt;Hi Community,&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;I need to load data from PostgreSQL into Databricks through JDBC without changing the data type of a VARCHAR[]column in PostgreSQL, which should remain as an array of strings in Databricks.&lt;/P&gt;&lt;P&gt;Previously, I used psycopg2, and it worked, but I had to first convert the data in Pandas and then in Spark before using it in Databricks. I’d like to switch to JDBC for a more direct approach.&lt;/P&gt;&lt;P&gt;However, when I load the data, the VARCHAR[] column appears as a string in the format:&lt;BR /&gt;"{LIKE, COME, "LIKE_COME"}&lt;BR /&gt;Similarly, arrays of BIGINT appear as:&lt;BR /&gt;"{45,46}".&lt;/P&gt;&lt;P&gt;I understand that PostgreSQL array types are not natively supported in Databricks, but what steps should I take to retain the same data type after loading via JDBC?&lt;/P&gt;&lt;P&gt;Any guidance would be appreciated!&lt;/P&gt;</description>
      <pubDate>Fri, 07 Feb 2025 22:59:03 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-read-through-jdbc-from-postgres-to-databricks-a-particular/m-p/109455#M43324</guid>
      <dc:creator>jeremy98</dc:creator>
      <dc:date>2025-02-07T22:59:03Z</dc:date>
    </item>
    <item>
      <title>Re: how read through jdbc from postgres to databricks a particular data type</title>
      <link>https://community.databricks.com/t5/data-engineering/how-read-through-jdbc-from-postgres-to-databricks-a-particular/m-p/109489#M43330</link>
      <description>&lt;P&gt;Hi community,&lt;/P&gt;&lt;P&gt;Yesterday, I found a solution. This is to query through jdbc from postgres creating two columns that are manageable in databricks. Here the code:&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;LI-CODE lang="python"&gt;   query = f"""(SELECT *, array_to_string(columns_to_export, ',') AS columns_to_export_string, 
      array_to_string(ups, ',') AS ups_to_string from {pg_table_name}) as data_to_fetch"""
    postgres_df = spark.read.format("jdbc") \
      .option("url", connection_properties["jdbc_url"]) \
      .option("dbtable", query) \
      .option("user", connection_properties["db_username"]) \
      .option("password", connection_properties["db_password"]) \
      .option("fetchsize", FETCH_SIZE) \
      .load() \
      .withColumn("columns_to_export", split(col("columns_to_export_string"), ",")) \
      .withColumn("ups", split(col("ups_to_string"), ",").cast("array&amp;lt;bigint&amp;gt;")) \
      .drop("columns_to_export_string", "ups_to_string")&lt;/LI-CODE&gt;</description>
      <pubDate>Sat, 08 Feb 2025 14:12:04 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-read-through-jdbc-from-postgres-to-databricks-a-particular/m-p/109489#M43330</guid>
      <dc:creator>jeremy98</dc:creator>
      <dc:date>2025-02-08T14:12:04Z</dc:date>
    </item>
  </channel>
</rss>

