12-10-2024 11:00 PM
We have a date (DD/MM/YYYY) partitioned BigQuery table and want to overwrite the data of a specific partition using PySpark. Following the Spark BigQuery connector documentation, I set 'spark.sql.sources.partitionOverwriteMode' to 'DYNAMIC', but the write still deleted the data in the other partitions, which should not happen.
df_with_partition.write.format("bigquery") \
    .option("table", f"{bq_table_full}") \
    .option("partitionField", f"{partition_date}") \
    .option("partitionType", f"{bq_partition_type}") \
    .option("temporaryGcsBucket", f"{temp_gcs_bucket}") \
    .option("spark.sql.sources.partitionOverwriteMode", "DYNAMIC") \
    .option("writeMethod", "indirect") \
    .mode("overwrite") \
    .save()
Can anyone suggest what I am doing wrong, or how to implement dynamic partitionOverwriteMode correctly? Many thanks.
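Should I instead be setting the mode at the Spark session level before the write, rather than as a writer option? A minimal sketch of what I mean, using the same placeholder variables as above:

# Variation: set the overwrite mode on the session instead of the writer
spark.conf.set("spark.sql.sources.partitionOverwriteMode", "DYNAMIC")

df_with_partition.write.format("bigquery") \
    .option("table", f"{bq_table_full}") \
    .option("partitionField", f"{partition_date}") \
    .option("partitionType", f"{bq_partition_type}") \
    .option("temporaryGcsBucket", f"{temp_gcs_bucket}") \
    .option("writeMethod", "indirect") \
    .mode("overwrite") \
    .save()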
#pyspark #overwrite #partition #dynamic #bigquery
- Labels: Spark
01-08-2025 04:51 AM
Just checking in: are there any further questions, and did my last comment help?
03-13-2025 10:18 PM
The issue got resolved with DBR 16.1. Many thanks to the Support Team.
05-05-2025 04:06 PM - edited 05-05-2025 04:08 PM
I'm using DBR 16.3 and all partitions are still being deleted. This is the code I'm using, with no success.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, lit

spark = (
    SparkSession.builder.config("spark.datasource.bigquery.intermediateFormat", "orc")
    .config("spark.sql.sources.partitionOverwriteMode", "dynamic")
    .getOrCreate()
)

# Read a single day's partition from the Delta source
visiting_client_day = (
    spark.read.format("delta")
    .load("s3://bucket-2/gold/visiting_client_day")
    .where(col("date_utc") == lit("2025-05-04"))
)

# Write to BigQuery; FINAL_TABLE is defined elsewhere
(
    visiting_client_day.write.format("bigquery")
    .option("parentProject", "parentProject")
    .option("project", "project")
    .option("temporaryGcsBucket", "bucket")
    .mode("overwrite")
    .option("table", FINAL_TABLE)
    .save()
)
05-05-2025 10:11 PM
Hi @ambar2595 ,
Could you please try adding the 'writeMethod' option with the value 'indirect'?
.option("writeMethod", "indirect")
05-06-2025 01:46 AM - edited 05-06-2025 01:51 AM
According to the documentation, 'indirect' is already the default value: https://github.com/GoogleCloudDataproc/spark-bigquery-connector/blob/master/README.md
I just tried setting it explicitly and it didn't work. 😞
05-06-2025 01:51 AM
Yes, agreed. Give it a try anyway. If it doesn't work, then this issue was likely introduced with DBR 16.3. Earlier, DBR 15.4 LTS had the same issue, which was fixed in DBR 16.1.
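If it helps, a quick sketch (assuming a Databricks notebook, where spark is predefined) to confirm which runtime and Spark version the job actually runs on:

import os

# Databricks Runtime version, e.g. "16.1" (set by the platform on cluster nodes)
print(os.environ.get("DATABRICKS_RUNTIME_VERSION"))

# Underlying Apache Spark version
print(spark.version)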
05-06-2025 08:37 AM
It didn't work with DBR 16.1 either.
05-07-2025 05:59 AM
I am still using DBR 16.1, and partitionOverwriteMode set to 'DYNAMIC' is working for me. I rechecked the workflow again today.

