Issue while writing data to unity catalog using JDBC

DumbBeaver
New Contributor II

When I write data to a pre-existing table in Unity Catalog using JDBC, only the delta of the data (the newly added rows) ends up in the table.

Driver used: 

com.databricks:databricks-jdbc:2.6.36

Let's say the table has these rows:

+-+-+
|a|b|
+-+-+
|1|2|
|3|4|
+-+-+

and I am appending this row using `.union`:

+-+-+
|a|b|
+-+-+
|1|2|
+-+-+

I read the table and `.union` it with the row to append, which results in this in my logs:

+-+-+
|a|b| 
+-+-+
|1|2|
|3|4|
+-+-+

But after writing to the table with mode "overwrite" using JDBC:

write_options = {"url": catalog_jdbc_url,
                 "driver": "com.databricks.client.jdbc.Driver"}
(df.write.format("jdbc")
 .options(**write_options)
 .option("truncate", "true")
 .option("dbtable", table_name)
 .save(mode="overwrite"))

the table in Unity Catalog shows only:

+-+-+
|a|b|
+-+-+
|1|2|
+-+-+

Is there a reason for this, or is it a config issue?

1 REPLY

Kaniz_Fatma
Community Manager

Hi @DumbBeaver, when writing data to a pre-existing table in Unity Catalog using JDBC, it's essential to understand how the .union operation and the overwrite mode work.

  1. Union Operation:

    • When you use .union to append rows to an existing DataFrame, it combines the rows from both DataFrames (the original table and the appended rows) without removing any duplicates.
    • In your case, you appended the row (1, 2) to the existing table, resulting in the combined DataFrame with both (1, 2) and (3, 4).
  2. Overwrite Mode:

    • When you write data to a table with the overwrite mode, it replaces the entire contents of the table with the new data.
    • In your code snippet, you specified .option("truncate", "true"), which tells Spark to truncate (empty) the existing table instead of dropping and recreating it before writing the new data.
    • A likely explanation for the result: Spark evaluates DataFrames lazily. If df is built by reading the same table you are overwriting, the table may be truncated before that read actually runs, so the union sees an empty table plus the new row and only (1, 2) is written.
  3. Possible Solutions:

    • If you want to keep the existing rows and add the new one, write only the new row with .mode("append") instead of overwriting the table with the union.
    • If you intend to replace the entire table with the union result, make sure that result is materialized somewhere else (for example, in a staging table) before the overwrite runs, since a lazy read of a table that has already been truncated returns no rows.
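The append suggestion above could be sketched roughly as follows. This is a minimal sketch, not a verified recipe for the Databricks JDBC driver: `append_new_rows` and `new_rows_df` are hypothetical names, while `write_options` and `table_name` are assumed to be the same values used in the question. It relies only on the standard PySpark `DataFrameWriter` calls.

```python
def append_new_rows(new_rows_df, write_options, table_name):
    """Append only the new rows to an existing table over JDBC.

    new_rows_df is assumed to be a PySpark DataFrame holding just the
    rows to add (not the union with the existing table).
    """
    (new_rows_df.write.format("jdbc")
     .options(**write_options)       # url and driver, as in the question
     .option("dbtable", table_name)  # target table
     .mode("append")                 # keep existing rows, add the new ones
     .save())
```

Because the existing rows are never read or rewritten, this avoids both the truncate step and the need to union with the current table contents.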

Remember to adjust your approach based on whether you want to keep the existing data or replace it entirely. The behavior you observed is consistent with the combination of .union and .overwrite as described above.
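One possible explanation for why the unioned row went missing, offered as an assumption rather than a confirmed diagnosis, is lazy evaluation: if the read of the target table only runs after the table has been truncated, the existing rows are gone by the time the union is computed. The sketch below is a plain-Python analogy, not Spark code. It uses the standard library `sqlite3` module and a generator to stand in for a lazy read, and the table `t`, the helper names, and the starting row (3, 4) are all made up for illustration.

```python
import sqlite3


def make_table():
    """Create an in-memory table t holding one existing row (3, 4)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (a INTEGER, b INTEGER)")
    conn.execute("INSERT INTO t VALUES (3, 4)")
    return conn


def lazy_union(conn, new_rows):
    """Yield existing rows, then new rows. Nothing is read until iteration."""
    yield from conn.execute("SELECT a, b FROM t")
    yield from new_rows


def overwrite(conn, rows):
    """Empty the table first, then insert, like overwrite with truncate."""
    conn.execute("DELETE FROM t")
    conn.executemany("INSERT INTO t VALUES (?, ?)", rows)


def table_rows(conn):
    return sorted(conn.execute("SELECT a, b FROM t").fetchall())


# Lazy plan: the SELECT runs only after the DELETE, so (3, 4) is lost.
conn = make_table()
overwrite(conn, lazy_union(conn, [(1, 2)]))
lost = table_rows(conn)   # [(1, 2)]

# Materialized first: the rows are read before the DELETE runs.
conn = make_table()
overwrite(conn, list(lazy_union(conn, [(1, 2)])))
kept = table_rows(conn)   # [(1, 2), (3, 4)]

print(lost, kept)
```

The only difference between the two runs is whether the combined rows are read before or after the table is emptied, which is why materializing the data elsewhere before an overwrite is the usual safeguard.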

If you have any further questions or need additional assistance, feel free to ask! 😊

 
