cgrant
Databricks Employee

Databricks uses a special commit protocol, DBIO, which writes the _started and _committed marker files so that writes to cloud storage are transactional.

You can disable this by setting the following Spark config:

spark.conf.set("spark.sql.sources.commitProtocolClass", "org.apache.spark.sql.execution.datasources.SQLHadoopMapReduceCommitProtocol")
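If you want to be able to switch back later, one option is to wrap the setting in a small helper that remembers the previous value. This is only a sketch: the helper name `use_hadoop_commit_protocol` and the two constants are made up for illustration, and `spark` is assumed to be the active SparkSession that a Databricks notebook provides.

```python
# Config key and class name taken from the setting above.
COMMIT_PROTOCOL_KEY = "spark.sql.sources.commitProtocolClass"
HADOOP_COMMIT_PROTOCOL = (
    "org.apache.spark.sql.execution.datasources."
    "SQLHadoopMapReduceCommitProtocol"
)


def use_hadoop_commit_protocol(spark):
    """Hypothetical helper: switch the session to the Hadoop commit protocol.

    Returns the value that was configured before the change (or None if the
    session reported no value), so the caller can restore it later.
    """
    # Passing a default avoids an exception when the key has no value.
    previous = spark.conf.get(COMMIT_PROTOCOL_KEY, None)
    spark.conf.set(COMMIT_PROTOCOL_KEY, HADOOP_COMMIT_PROTOCOL)
    return previous
```

In a notebook you would call `previous = use_hadoop_commit_protocol(spark)` before the write and, if `previous` is not None, restore it afterwards with `spark.conf.set(COMMIT_PROTOCOL_KEY, previous)`. The setting is session-scoped, so it only affects writes issued from the session where it was set.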

Also, you can read more about DBIO here
