Parquet to csv delta file
03-26-2023 08:51 PM
Hi Team,
I have a Parquet dataset in an S3 bucket that is stored as a Delta table. I am able to read it, but I am unable to write it out as a CSV file.
I get the following error when I try to write:
A transaction log for Databricks Delta was found at `s3://path/abc/_delta_log`,
but you are trying to write to `s3://path/abc/` using format("csv"). You must use
'format("delta")' when reading and writing to a delta table.
I am using this call to write the CSV:
abc.write.format("delta").mode("overwrite").options(delimiter="|").csv(destinationBucketPath)
Let me know if I need to change anything.
Labels: CSV File, Databricks delta, File, Parquet, Parquet File
03-27-2023 04:59 AM
Please share the exact code so that we can reproduce the issue.
04-01-2023 10:25 PM
@yuvesh kotiala :
The error is raised because the path you are writing to, s3://path/abc/, already contains a Delta transaction log (_delta_log), and only format("delta") can be used to write to such a path. Note that the format("delta") in your code has no effect: the csv() call at the end of the chain switches the writer to CSV, which is why the message mentions format("csv"). If you want the data as CSV, first read the Delta table as a dataframe and then write that dataframe as CSV. Here's an example:
from pyspark.sql import SparkSession
# create a SparkSession
spark = SparkSession.builder.appName("DeltaToCSV").getOrCreate()
# read the Delta file as a dataframe
delta_df = spark.read.format("delta").load("s3://path/abc/")
# write the dataframe as pipe-delimited CSV (Spark writes a directory of part files)
delta_df.write.format("csv").mode("overwrite").options(delimiter="|").save(destinationBucketPath)
Note that the read uses format("delta") with load(), while the write uses format("csv") with save(). Make sure destinationBucketPath points to a location other than the Delta table path (s3://path/abc/ in the error message); writing CSV into a directory that holds a _delta_log raises the same error again.
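A small guard before the write can catch the case where the CSV destination is the Delta table location itself, which is the situation the error message describes. The following is only a sketch, not a Databricks API: overlaps_delta_table and export_delta_to_csv are hypothetical helper names, treating a destination nested under the table path as unsafe is a conservative assumption, and the header option is an optional extra that was not in the original code.

```python
def overlaps_delta_table(csv_path: str, delta_path: str) -> bool:
    """Return True if csv_path is the Delta table path or sits underneath it."""
    dest = csv_path.rstrip("/")
    table = delta_path.rstrip("/")
    return dest == table or dest.startswith(table + "/")


def export_delta_to_csv(spark, delta_path: str, csv_path: str, delimiter: str = "|") -> None:
    """Read a Delta table and write it as delimited CSV to a separate location."""
    # Hypothetical safety check: refuse to write CSV into the Delta table directory.
    if overlaps_delta_table(csv_path, delta_path):
        raise ValueError(
            f"CSV destination {csv_path!r} overlaps the Delta table at {delta_path!r}; "
            "choose a different output location."
        )
    # Read with the Delta reader, then write with the CSV writer.
    df = spark.read.format("delta").load(delta_path)
    (
        df.write.format("csv")
        .mode("overwrite")
        .option("delimiter", delimiter)
        .option("header", "true")  # optional: include column names in the output
        .save(csv_path)
    )
```

On a cluster where Delta is available, this could be called as export_delta_to_csv(spark, "s3://path/abc/", destinationBucketPath), with destinationBucketPath set to a folder outside s3://path/abc/.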
04-03-2023 11:40 PM
Hi @yuvesh kotiala
Hope all is well! Just wanted to check in: were you able to resolve your issue? If so, would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help.
We'd love to hear from you.
Thanks!