Data Engineering

Parquet to csv delta file

uv
New Contributor II

Hi Team,

I have a Parquet file in an S3 bucket that is stored as a Delta table. I am able to read it, but I am unable to write it out as a CSV file.

I get the following error when I try to write:

A transaction log for Databricks Delta was found at `s3://path/abc/_delta_log`,
but you are trying to write to `s3://path/abc/` using format("csv"). You must use
'format("delta")' when reading and writing to a delta table.

I am using this code to write the CSV:

abc.write.format("delta").mode("overwrite").options(delimiter="|").csv(destinationBucketPath)

Let me know if I need to change anything.

3 REPLIES

Aviral-Bhardwaj
Esteemed Contributor III

Please share the exact code so that we can replicate the issue.

AviralBhardwaj

Anonymous
Not applicable

@yuvesh kotiala:

The error message means that the path you are writing to, s3://path/abc/, already contains a Delta transaction log (_delta_log), so Spark treats it as a Delta table and refuses to write any other format there. In your code, the trailing csv() call overrides the earlier format("delta"), so the write is attempted as CSV into the Delta table's directory, which is causing the error. If you want the data as CSV, first read the Delta table as a dataframe and then write it as CSV to a different location that is not a Delta table. Here's an example:

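from pyspark.sql import SparkSession

# create a SparkSession (in a Databricks notebook, `spark` already exists and this returns it)
spark = SparkSession.builder.appName("DeltaToCSV").getOrCreate()

# read the Delta table as a dataframe
delta_df = spark.read.format("delta").load("s3://path/abc/")

# destinationBucketPath must not be the Delta table directory (s3://path/abc/);
# point it at a separate location for the CSV output
# write the dataframe as pipe-delimited CSV files
delta_df.write.format("csv").mode("overwrite").options(delimiter="|").save(destinationBucketPath)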

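Note that you read the Delta table with format("delta") and load(), and write the CSV output with format("csv") and save(). Do not combine format("delta") with csv() as in your original code: csv() sets the format to CSV regardless of the earlier format() call. The destination must also be a path other than s3://path/abc/, since that directory holds the Delta table.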
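
If it helps, here is a small guard you could run before the write to catch the situation the error describes, where the CSV destination is the Delta table directory itself or a folder inside it. This is only a sketch: the function names are made up for illustration, it compares the path strings only, and it does not look at S3 to see whether a _delta_log folder actually exists.

```python
import posixpath
from typing import Tuple
from urllib.parse import urlparse


def _split_uri(uri: str) -> Tuple[str, str, str]:
    """Split a URI into (scheme, bucket, normalized path without trailing slash)."""
    parsed = urlparse(uri)
    # Force exactly one leading slash so normpath treats every path the same way.
    path = posixpath.normpath("/" + parsed.path.lstrip("/"))
    return parsed.scheme, parsed.netloc, path


def overlaps_delta_table(destination: str, delta_table_root: str) -> bool:
    """Return True if destination is the Delta table root or nested inside it.

    Hypothetical helper: a string comparison only, no storage access.
    """
    d_scheme, d_bucket, d_path = _split_uri(destination)
    t_scheme, t_bucket, t_path = _split_uri(delta_table_root)
    if (d_scheme, d_bucket) != (t_scheme, t_bucket):
        return False
    # Append "/" to the root so that /abc does not match /abc_csv.
    return d_path == t_path or d_path.startswith(t_path.rstrip("/") + "/")
```

You could call overlaps_delta_table(destinationBucketPath, "s3://path/abc/") and raise your own error if it returns True, instead of letting the write fail partway through the job.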

Anonymous
Not applicable

Hi @yuvesh kotiala

Hope all is well! Just wanted to check in: were you able to resolve your issue? If so, would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help.

We'd love to hear from you.

Thanks!
