Re: data frame takes unusually long time to write ...

elgeo · 11-10-2022

Hello. We are facing exactly the same issue: reading is quick, but writing takes a long time. To clarify, the table has only about 700k rows. Any suggestions, please? Thank you.

remote_table = spark.read.format("jdbc") \
    .option("driver", "com.ibm.as400.access.AS400JDBCDriver") \
    .option("url", "url") \
    .option("dbtable", "table_name") \
    .option("partitionColumn", "ID") \
    .option("lowerBound", "0") \
    .option("upperBound", "700000") \
    .option("numPartitions", "1000") \
    .option("user", "user") \
    .option("password", "pass") \
    .load()

remote_table.write.format("delta").mode("overwrite") \
    .option("overwriteSchema", "true") \
    .partitionBy("ID") \
    .saveAsTable("table_name")
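
For anyone looking at this, here is a rough back-of-the-envelope sketch of what the write above might be producing. It assumes that ID is unique per row (the bounds of 0 to 700000 on a 700k-row table suggest this, but it is not confirmed), that each write task emits one file per partition value it holds, and that no size-based file splitting occurs. The helper `estimate_delta_layout` and the 12-value comparison column are hypothetical and only there for illustration; this is plain Python arithmetic, not a Spark or Delta API.

```python
def estimate_delta_layout(row_count, distinct_partition_values, write_tasks):
    """Rough bounds on directories and files produced by a partitioned write.

    Assumptions (not from the original post): every write task emits one file
    per partition value it holds, and no size-based file splitting happens.
    """
    # partitionBy creates one directory per distinct value of the column.
    directories = distinct_partition_values
    # Every partition directory contains at least one data file.
    min_files = distinct_partition_values
    # A file holds at least one row; each task adds at most one file per value.
    max_files = min(row_count, distinct_partition_values * write_tasks)
    return {
        "directories": directories,
        "min_files": min_files,
        "max_files": max_files,
        "max_avg_rows_per_file": row_count / min_files,
    }


# Numbers from the post: ~700k rows, numPartitions=1000, partitionBy("ID").
# Assumes ID is unique, so there are as many partition values as rows.
by_id = estimate_delta_layout(700_000, 700_000, 1_000)

# Hypothetical comparison: a low-cardinality column with 12 distinct values.
by_low_cardinality = estimate_delta_layout(700_000, 12, 1_000)

print("partitionBy(ID):", by_id)
print("partitionBy(12-value column):", by_low_cardinality)
```

If the assumption holds, partitioning by ID would mean roughly one directory and one tiny file per row, which could account for a slow write even though the read is fast. Whether that is the actual cause here would need to be checked against the table's storage location.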