Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

JDBC RDBMS Table Overwrite Transaction Incomplete

diegohMoodys
New Contributor

Spark version:  spark-3.4.1-bin-hadoop3

JDBC Driver: mysql-connector-j-8.4.0.jar

Assumptions:

  • I have all the required read/write permissions
  • the dataset isn't large: ~2 million records
  • the job reads flat files and writes to a database
  • it never reads from the database

I run this write under nohup:
```

out.show()  # inspect the computed result

print("Writing to database")

jdbc_url = f"jdbc:mysql://{hostname}/{database}?rewriteBatchedStatements=true"
properties = {
    "user": ****,
    "password": ****,
    # Connector/J 8.x class name; "com.mysql.jdbc.Driver" is the legacy alias
    "driver": "com.mysql.cj.jdbc.Driver",
}
out.write.jdbc(url=jdbc_url, table="story_count_by_entity_id",
               mode="overwrite", properties=properties)
print("DONE")

```
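One thing worth knowing here: with `mode="overwrite"`, Spark's JDBC writer drops and recreates the target table by default; the `truncate` option keeps the existing DDL and truncates instead. A minimal sketch of the same write in `format("jdbc")` style (`hostname`, `database`, and the DataFrame `out` are names from the post; the `truncate` and `batchsize` values are assumptions, not a known fix):

```python
# Sketch under the post's assumptions; the Spark write itself is commented
# out because it needs a live SparkSession and the `out` DataFrame.
def build_jdbc_url(hostname: str, database: str) -> str:
    # rewriteBatchedStatements speeds up batched INSERTs on MySQL,
    # but it also makes per-batch failures harder to spot in the logs.
    return f"jdbc:mysql://{hostname}/{database}?rewriteBatchedStatements=true"

# (out.write.format("jdbc")
#     .option("url", build_jdbc_url(hostname, database))
#     .option("dbtable", "story_count_by_entity_id")
#     .option("driver", "com.mysql.cj.jdbc.Driver")  # Connector/J 8.x class
#     .option("truncate", "true")   # truncate instead of drop/recreate
#     .option("batchsize", 10000)   # assumed value; default is 1000
#     .mode("overwrite")
#     .save())
```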
The write never fully completes, yet the console output suggests the job succeeded:
```
|5011422399|2025-01-16 00:00:00|         1|         2|         5|         16|         48|         224|         562|   9|
|6001131375|2025-01-16 00:00:00|       353|       629|      2163|       9314|      18256|       23679|       53813|5229|
|2707170344|2025-01-16 00:00:00|        11|        23|        48|        293|       1728|        3113|        4169| 106|
|2838891055|2025-01-16 00:00:00|         3|         3|        29|         78|         98|         167|         350|  54|
|3784123049|2025-01-16 00:00:00|         6|        10|       113|        238|        472|        1076|        3119| 163|
+----------+-------------------+----------+----------+----------+-----------+-----------+------------+------------+----+
only showing top 20 rows

Writing to database
DONE
```
But the database shows incomplete data:

(attached screenshot: diegohMoodys_0-1737041259601.png)


Is there some way of raising an error, or of checking whether the Spark transaction failed? Any other suggestions on how to approach this?
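For what it's worth, `out.write.jdbc` only raises if the driver surfaces an error, so a partially applied overwrite can still reach `print("DONE")`. One cheap end-to-end check is to read the row count back through the same JDBC connection and compare it against the source count. `check_rowcount` below is a hypothetical helper, not part of any API; the commented usage reuses names from the post:

```python
def check_rowcount(expected: int, actual: int, table: str) -> None:
    """Raise if the written table does not contain the expected number of rows."""
    if expected != actual:
        raise RuntimeError(
            f"write to {table} incomplete: expected {expected} rows, found {actual}"
        )

# Usage with a live SparkSession (names taken from the post):
# expected = out.count()
# written = spark.read.jdbc(url=jdbc_url, table="story_count_by_entity_id",
#                           properties=properties).count()
# check_rowcount(expected, written, "story_count_by_entity_id")
```

This turns a silent partial write into a hard failure the nohup log will actually show.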

1 REPLY

Alberto_Umana
Databricks Employee

Hi @diegohMoodys,

Can you try running it in debug mode?

spark.sparkContext.setLogLevel("DEBUG")
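DEBUG logging on the whole job is extremely verbose, so one option is to raise the level only around the write and restore it afterwards. `log_level` is a hypothetical helper, not a Spark API; since `setLogLevel` has no matching getter, the caller supplies the level to restore:

```python
from contextlib import contextmanager

@contextmanager
def log_level(sc, level: str, restore: str = "WARN"):
    """Temporarily set the Spark log level, then restore it."""
    sc.setLogLevel(level)
    try:
        yield
    finally:
        sc.setLogLevel(restore)

# Usage with a live SparkSession (names taken from the post):
# with log_level(spark.sparkContext, "DEBUG"):
#     out.write.jdbc(url=jdbc_url, table="story_count_by_entity_id",
#                    mode="overwrite", properties=properties)
```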
