10-21-2022 06:09 AM
Hi, I am trying to load data from a data lake into a SQL table using a "SourceDataFrame.write" operation in a notebook using Apache Spark.
This seems to load duplicates at random times. The logs don't give much information, and I am not sure what else to look for. How can I investigate and find the root cause of this? Please let me know what more information I can provide so someone can help.
Thanks!
10-24-2022 05:22 AM
Can you elaborate a bit more on this notebook?
And also, what Databricks Runtime version?
10-26-2022 03:13 AM
Hi @Werner Stinckens, this is an Apache Spark notebook, which reads the contents of a file stored in Azure Blob Storage and loads it into an on-prem SQL table.
Databricks Runtime is 9.1 LTS (includes Apache Spark 3.1.2, Scala 2.12) with a Standard_DS3_v2 worker/driver node type.
The notebook reads the file content using the code below:
// Read the pipe-delimited source file with an explicit schema and no header row.
val SourceDataFrame = spark
  .read
  .option("header", "false")
  .option("delimiter", "|")
  .schema(SourceSchemaStruct)
  .csv(SourceFilename)
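(As a quick sanity check while investigating, the row count taken right after the read can be compared with the target table's row count after the load; a mismatch only on the runs that show duplicates would point at the write path rather than the source file. A minimal sketch:)

// Sanity check: record how many rows were read from the source file,
// so it can be compared with the target table's row count after the load.
val sourceRowCount = SourceDataFrame.count()
println(s"Source row count: $sourceRowCount")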
Then it writes the DataFrame into a table in overwrite mode:
// Write to the on-prem SQL table over JDBC, replacing the existing contents.
// SourceDataFrame2 is presumably derived from SourceDataFrame above
// (the intermediate step is not shown in the post).
SourceDataFrame2
  .write
  .format("jdbc")
  .mode("overwrite")
  .option("driver", driverClass)
  .option("url", jdbcUrl)
  .option("dbtable", TargetTable)
  .option("user", jdbcUsername)
  .option("password", jdbcPassword)
  .save()
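(One way to rule out duplicates in the source file itself is a quick grouped count before the write. A sketch, where keyCols is a hypothetical placeholder for whatever columns uniquely identify a row in this dataset:)

// Diagnostic sketch: list source rows that occur more than once.
// "keyCols" is a hypothetical placeholder; replace with the real key columns.
import org.apache.spark.sql.functions.col
val keyCols = Seq("col1", "col2")
val duplicateRows = SourceDataFrame
  .groupBy(keyCols.map(col): _*)
  .count()                       // adds a "count" column per group
  .filter(col("count") > 1)
duplicateRows.show()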
11-03-2022 02:21 AM
Can you add the truncate option? In overwrite mode, Spark drops and recreates the table by default; with truncate=true it truncates the existing table instead.
https://spark.apache.org/docs/latest/sql-data-sources-jdbc.html
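(For reference, a sketch of the same write as above with the truncate option added, reusing the variables from the earlier post:)

// Same JDBC write as above, with the truncate option added.
// With truncate=true, SaveMode.Overwrite truncates the existing table
// instead of dropping and recreating it (see the linked JDBC docs).
SourceDataFrame2
  .write
  .format("jdbc")
  .mode("overwrite")
  .option("truncate", "true")
  .option("driver", driverClass)
  .option("url", jdbcUrl)
  .option("dbtable", TargetTable)
  .option("user", jdbcUsername)
  .option("password", jdbcPassword)
  .save()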
10-25-2022 03:57 AM
Hi @Priya Mani, we haven't heard from you since the last response from @Werner Stinckens, and I was checking back to see if you have a resolution yet.
If you have found a solution, please share it with the community, as it can be helpful to others. Otherwise, we will respond with more details and try to help.
Also, please don't forget to click the "Select As Best" button whenever a response helps resolve your question.