Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Databricks Notebook dataframe loading duplicate data in SQL table

Priya_Mani
New Contributor II

Hi, I am trying to load data from a data lake into a SQL table using a "SourceDataFrame.write" operation in an Apache Spark notebook.

This seems to load duplicates at random times. The logs don't give much information, and I am not sure what else to look for. How can I investigate and find the root cause of this? Please let me know what more information I can provide for anyone to help.

Thanks!

3 REPLIES

-werners-
Esteemed Contributor III

Can you elaborate a bit more on this notebook?

Also, which Databricks Runtime version are you on?

Hi @Werner Stinckens, this is an Apache Spark notebook which reads the contents of a file stored in Azure Blob Storage and loads it into an on-prem SQL table.

The Databricks Runtime is 9.1 LTS (includes Apache Spark 3.1.2, Scala 2.12) with Standard_DS3_v2 worker and driver node types.

The notebook reads the file contents using the code below:

val SourceDataFrame = spark
  .read
  .option("header", "false")
  .option("delimiter", "|")
  .schema(SourceSchemaStruct)
  .csv(SourceFilename)
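Since duplicates appear "at random times", one place to look first is the source file itself. A minimal self-contained sketch of a duplicate check (the sample data and column names here are hypothetical; in the notebook you would run the `groupBy`/`filter` part against `SourceDataFrame` instead):

```scala
import org.apache.spark.sql.SparkSession

object DupCheck {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("dup-check")
      .master("local[1]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical stand-in for SourceDataFrame; replace with your CSV read.
    val df = Seq((1, "a"), (2, "b"), (2, "b")).toDF("id", "value")

    // Group on every column to find rows that occur more than once.
    val dupRows = df
      .groupBy(df.columns.map(df(_)): _*)
      .count()
      .filter($"count" > 1)

    // Number of distinct rows that are duplicated in the source.
    println(dupRows.count())
    spark.stop()
  }
}
```

If this count is nonzero on a run that produced duplicates in SQL, the problem is upstream of the write, not in the JDBC load.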

Then it writes the DataFrame to the table in overwrite mode:

SourceDataFrame2
      .write
      .format("jdbc")
      .mode("overwrite")
      .option("driver", driverClass)
      .option("url", jdbcUrl)
      .option("dbtable", TargetTable)
      .option("user", jdbcUsername)
      .option("password", jdbcPassword)
      .save()
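One thing worth checking (a hedged suggestion, not a confirmed diagnosis): with the JDBC source, `.mode("overwrite")` drops and recreates the target table, so lingering duplicates usually come either from duplicate rows in the source file or from the job running twice. A defensive sketch that drops exact duplicate rows before writing (sample data is hypothetical; the real write options are the ones from the thread):

```scala
import org.apache.spark.sql.SparkSession

object DedupWrite {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("dedup-write")
      .master("local[1]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical stand-in for SourceDataFrame2; replace with your real DataFrame.
    val df = Seq((1, "a"), (1, "a"), (2, "b")).toDF("id", "value")

    // Remove exact duplicate rows across all columns before writing.
    val deduped = df.dropDuplicates()

    println(deduped.count())

    // The JDBC write would then follow the pattern from the thread, e.g.:
    // deduped.write
    //   .format("jdbc")
    //   .mode("overwrite")
    //   .option("truncate", "true") // TRUNCATE instead of DROP/CREATE, keeps the table schema
    //   .option("url", jdbcUrl)
    //   ...
    //   .save()
    spark.stop()
  }
}
```

The `truncate` option is also worth knowing about: it makes overwrite truncate the existing table rather than dropping and recreating it, which preserves indexes and constraints on the SQL side.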

-werners-
Esteemed Contributor III
