Incompatible format detected while writing in Parquet format.
10-20-2022 12:42 PM
I am writing/reading data between Azure Databricks and the data lake. I wrote a dataframe to a path in Delta format using query (a) below. Later I realized that I needed the data in Parquet format, so I went to the storage account and manually deleted the file path. Now, when I try to execute query (b), it always throws error (c) below. I am pretty sure the file path no longer exists in storage because I deleted it manually. What is missing here? Is this some kind of bug? Thanks in advance!
a) df.coalesce(1).write.format('delta').mode('overwrite').option('overwriteSchema', 'true').save(filepath)
b) df.coalesce(1).write.format('parquet').mode('overwrite').option('overwriteSchema', 'true').save(filepath)
c) AnalysisException: Incompatible format detected.
A transaction log for Databricks Delta was found at `filepath/_delta_log`,
but you are trying to write to `filepath` using format("parquet"). You must use
'format("delta")' when reading and writing to a delta table.
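For context on why this error fires at all: Spark decides a path is a Delta table by looking for a `_delta_log` directory underneath it, and refuses non-Delta writes when one is found. A minimal sketch of that check, using a local temp directory as a stand-in for the storage account path (the function name `looks_like_delta_table` is hypothetical, not a Spark API):

```python
import os
import tempfile

def looks_like_delta_table(path: str) -> bool:
    # Spark treats a path as a Delta table when a `_delta_log`
    # directory exists underneath it.
    return os.path.isdir(os.path.join(path, "_delta_log"))

# Stand-in for the storage path the dataframe was written to.
filepath = tempfile.mkdtemp()
os.makedirs(os.path.join(filepath, "_delta_log"))

print(looks_like_delta_table(filepath))  # True: a parquet write here is refused
```

So as long as Spark still sees (or has cached a view of) that `_delta_log` directory, a `format("parquet")` write to the same path will raise the exception above.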
- Labels: Azure, Azure Databricks, Parquet Format
10-20-2022 01:50 PM
Update: I tried "Clear state and outputs", which did not help, but when I restarted the cluster the write worked without an issue. Although the issue is fixed, I still don't know what caused it in the first place.
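One way to sidestep this kind of stale-state problem in the future is to delete the old Delta output from within the notebook rather than through the storage account UI, so Spark's view of the path stays consistent. A rough sketch, using a local temp directory and `shutil.rmtree` as a stand-in (in Databricks the equivalent would be `dbutils.fs.rm(filepath, recurse=True)`):

```python
import os
import shutil
import tempfile

# Stand-in for the Delta output path in the data lake.
filepath = tempfile.mkdtemp()
os.makedirs(os.path.join(filepath, "_delta_log"), exist_ok=True)

# Remove the whole directory, including the `_delta_log` transaction
# log, before rewriting the data in parquet format.
shutil.rmtree(filepath)

print(os.path.exists(filepath))  # False: the path is fully gone
```

After a cleanup like this, `df.write.format('parquet').mode('overwrite').save(filepath)` should no longer trip the incompatible-format check.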
10-24-2022 11:19 AM
Hi @Kris Koirala,
Thank you for your reply. If you would like to find the root cause (RCA) of this issue, go to your driver logs and download the log4j, stdout, and stderr logs. These log files will help you narrow down the root cause and understand why the error was happening.

