Parquet to CSV delta file

uv
New Contributor II

Hi Team,

I have a Delta table (stored as Parquet files) in an S3 bucket. I am able to read it, but I am unable to write it out as a CSV file.

I am getting the following error when I try to write:

A transaction log for Databricks Delta was found at `s3://path/abc/_delta_log`,
but you are trying to write to `s3://path/abc/` using format("csv"). You must use
'format("delta")' when reading and writing to a delta table.

I am using this method to write the CSV:

abc.write.format("delta").mode("overwrite").options(delimiter="|").csv(destinationBucketPath)

Let me know if I need to change anything.

3 REPLIES

Aviral-Bhardwaj
Esteemed Contributor III

Please share the exact code so that we can try to replicate the issue.

Anonymous
Not applicable

@yuvesh kotiala:

The error message says that a Delta transaction log was found at `s3://path/abc/_delta_log` while you were writing to `s3://path/abc/` with format("csv"): in other words, your CSV write is targeting the Delta table's own directory, and Spark requires format("delta") for any read or write against a path that holds a Delta table. If you want the data as a CSV file, first read the Delta table into a DataFrame and then write that DataFrame as CSV to a different destination path. Here's an example:

from pyspark.sql import SparkSession

# create a SparkSession
spark = SparkSession.builder.appName("DeltaToCSV").getOrCreate()

# read the Delta table into a DataFrame
delta_df = spark.read.format("delta").load("s3://path/abc/")

# write the DataFrame out as pipe-delimited CSV
# (destinationBucketPath must point outside the Delta table's directory)
delta_df.write.format("csv").mode("overwrite").option("delimiter", "|").save(destinationBucketPath)

Note that when you read the Delta table, you need to use format("delta") and load() instead of csv() as in your original code. This reads the Delta table into a DataFrame, which can then be written as a CSV file using format("csv") and save().
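Also, by default Spark writes CSV output as a directory of part files. If you want a single pipe-delimited file with a header row instead, you can coalesce the DataFrame to one partition before writing. A minimal sketch, reusing delta_df and destinationBucketPath from above (coalesce(1) funnels all data through a single task, so this only suits outputs small enough to fit on one executor):

# optional: produce a single CSV part file with a header row
delta_df.coalesce(1) \
    .write.format("csv") \
    .mode("overwrite") \
    .option("delimiter", "|") \
    .option("header", "true") \
    .save(destinationBucketPath)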

Anonymous
Not applicable

Hi @yuvesh kotiala

Hope all is well! Just wanted to check in and see whether you were able to resolve your issue. If so, would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help.

We'd love to hear from you.

Thanks!
