JDBC RDBMS Table Overwrite Transaction Incomplete

diegohMoodys
New Contributor

Spark version:  spark-3.4.1-bin-hadoop3

JDBC Driver: mysql-connector-j-8.4.0.jar

Assumptions:

  • I have all the proper read/write permissions
  • the dataset isn't large: ~2 million records
  • the job reads flat files and writes to a database
  • it does not read from the database at all

I run this write under nohup:
```

out.show()  # perform a calculation

print("Writing to database")

jdbc_url = f"jdbc:mysql://{hostname}/{database}?rewriteBatchedStatements=true"
properties = {
    "user": "****",      # redacted
    "password": "****",  # redacted
    "driver": "com.mysql.jdbc.Driver",
}
out.write.jdbc(url=jdbc_url, table="story_count_by_entity_id", mode="overwrite", properties=properties)
print("DONE")

```
The write just doesn't complete. The output clearly shows the data is correct, and the script reaches DONE without raising an error:
```
|5011422399|2025-01-16 00:00:00|         1|         2|         5|         16|         48|         224|         562|   9|
|6001131375|2025-01-16 00:00:00|       353|       629|      2163|       9314|      18256|       23679|       53813|5229|
|2707170344|2025-01-16 00:00:00|        11|        23|        48|        293|       1728|        3113|        4169| 106|
|2838891055|2025-01-16 00:00:00|         3|         3|        29|         78|         98|         167|         350|  54|
|3784123049|2025-01-16 00:00:00|         6|        10|       113|        238|        472|        1076|        3119| 163|
+----------+-------------------+----------+----------+----------+-----------+-----------+------------+------------+----+
only showing top 20 rows

Writing to database
DONE
```
But the database shows incomplete data:

[Attached screenshot: diegohMoodys_0-1737041259601.png]


Is there a way to raise an error, or otherwise check, when the Spark write fails or only partially completes? Any other suggestions for how to approach this?
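
To make the kind of check I have in mind concrete, here is a minimal sketch and not something the job currently does. It assumes that reading the table back over JDBC is acceptable (the job itself never reads from the database) and that `spark` is the active SparkSession. The helper name `write_and_verify` is hypothetical. It counts the rows before the write, performs the same overwrite, reads the row count back, and raises if the two differ.

```python
def write_and_verify(df, spark, jdbc_url, table, properties):
    """Overwrite `table` over JDBC, then compare the row count read back.

    Hypothetical helper: `df` is the DataFrame to write and `spark` is
    assumed to be the active SparkSession.
    """
    # Rows Spark intends to write (this triggers its own Spark job).
    expected = df.count()

    # Same call as in the post; a failed task should normally surface
    # here as an exception rather than pass silently.
    df.write.jdbc(url=jdbc_url, table=table, mode="overwrite", properties=properties)

    # Read the table back with the same connection settings and count it.
    actual = spark.read.jdbc(url=jdbc_url, table=table, properties=properties).count()

    if actual != expected:
        raise RuntimeError(
            f"{table}: expected {expected} rows after overwrite, found {actual}"
        )
    return actual
```

In the script above this would replace the bare write, e.g. `write_and_verify(out, spark, jdbc_url, "story_count_by_entity_id", properties)`, so a short write stops the job with an error instead of printing DONE.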