How to create a csv using a Scala notebook that has " in some columns?

tarente
New Contributor III

In a project we use Azure Databricks to create csv files to be loaded in ThoughtSpot.

Below is a sample of the code I use to write the file:

// Write the result as a single tab-separated file with a header row.
val fileRepartition = 1
val fileFormat = "csv"
val fileSaveMode = "overwrite"
val fileOptions = Map(
  "header"          -> "true",
  "overwriteSchema" -> "true",
  "delimiter"       -> "\t"
)

// dfFinal and filePath are defined elsewhere in the notebook.
dfFinal
  .repartition(fileRepartition)
  .write
  .format(fileFormat)
  .mode(fileSaveMode)
  .options(fileOptions)
  .save(filePath)

The csv created uses a tab as the column separator, and some of the columns may have " in their values. When that happens, the value of that column is enclosed in " in the csv file and the " inside the value is escaped with a backslash. E.g.:

ProductId	ProductCode	ProductDesc
1234	BD Plastipak	"BD Plastipak 1/4\" Syringes"

Is it possible to change the write options so that the file is written as shown below?

ProductId	ProductCode	ProductDesc
1234	BD Plastipak	BD Plastipak 1/4" Syringes

I have a workaround: a subsequent step that uses sed to update the csv. It would be much easier, though, if I could get the file in the correct format when saving it from the notebook.

Thanks in advance,

Tiago R.

Accepted Solutions

shan_chandra
Databricks Employee

Could you please try adding escape as an option when writing the csv?

Please refer to the additional options available when writing to a CSV, listed under the CSV-specific options for writing CSV files.
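
For illustration, here is a minimal sketch of how a CSV writer option can be added to the options map from the question. It is not a confirmed fix from this thread. It uses escapeQuotes, a related option from the same list of Spark CSV writer options rather than the escape option itself. As far as I know, setting it to "false" stops the writer from enclosing a value in quotes merely because the value contains a " character. The names TsvWriter, tsvOptions and writeTsv are hypothetical, and dfFinal and filePath are assumed to exist as in the question.

```scala
import org.apache.spark.sql.DataFrame

// Hypothetical helper object; the name is not from the original post.
object TsvWriter {

  // Options from the question, plus escapeQuotes = false.
  // Assumption: with escapeQuotes disabled, a value that contains "
  // is written as-is instead of being enclosed in quotes.
  val tsvOptions: Map[String, String] = Map(
    "header"       -> "true",
    "delimiter"    -> "\t",
    "escapeQuotes" -> "false"
  )

  // Writes the DataFrame as a single tab-separated part file under `path`.
  def writeTsv(df: DataFrame, path: String): Unit =
    df.repartition(1)
      .write
      .format("csv")
      .mode("overwrite")
      .options(tsvOptions)
      .save(path)
}

// Usage in the notebook, assuming dfFinal and filePath from the question:
// TsvWriter.writeTsv(dfFinal, filePath)
```

Values that contain the delimiter itself (a tab here) or a line break would most likely still be quoted, so it is worth checking the output against real data before removing the sed step.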


tarente
New Contributor III

Hi Shan,

Thanks for the link.

I now know more options for creating different csv files.

I have not yet solved the problem, but what remains is related to the destination application (ThoughtSpot) not being able to load the data in the csv file correctly.

Regards,

Tiago R.
