Databricks Community

tarente · 09-18-2021

In a project we use Azure Databricks to create csv files to be loaded in ThoughtSpot.

Below is a sample of the code I use to write the file:

// Write dfFinal as a single tab-separated file with a header row.
val fileRepartition = 1
val fileFormat = "csv"
val fileSaveMode = "overwrite"
val fileOptions = Map(
  "header" -> "true",
  "overwriteSchema" -> "true",
  "delimiter" -> "\t"
)

// filePath is defined elsewhere in the notebook.
dfFinal
  .repartition(fileRepartition)
  .write
  .format(fileFormat)
  .mode(fileSaveMode)
  .options(fileOptions)
  .save(filePath)

The CSV that is created uses a tab as the column separator, and some of the columns may have " in their values. When that happens, the value of that column is enclosed in " in the CSV file and the embedded " is escaped with a backslash. E.g.:

ProductId	ProductCode	ProductDesc
1234	BD Plastipak	"BD Plastipak 1/4\" Syringes"

Is it possible to change the parameters to write the file as described below?

ProductId	ProductCode	ProductDesc
1234	BD Plastipak	BD Plastipak 1/4" Syringes

I have a workaround that uses sed in a subsequent step to update the CSV, but it would be much easier if I could get the file in the correct format when saving it from the notebook.

Thanks in advance,

Tiago R.

shan_chandra · 09-18-2021

Could you please try adding escape as an option when writing to CSV?

Please refer to the additional options available when writing to CSV, listed under the CSV-specific option(s) for writing CSV files.
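
For illustration, here is a minimal sketch of how CSV writer options can be passed in Scala. It uses escapeQuotes rather than escape, on the assumption that the goal is to stop Spark from wrapping values that contain " in quotes. The sample DataFrame, the app name, and the output path are hypothetical, and the resulting file should be checked against the Spark version in use.

```scala
import org.apache.spark.sql.SparkSession

// In a Databricks notebook `spark` already exists; getOrCreate() reuses it.
val spark = SparkSession.builder()
  .appName("csv-quote-sketch") // hypothetical name
  .master("local[*]")          // only relevant when run outside a cluster
  .getOrCreate()
import spark.implicits._

// Hypothetical sample row mirroring the example in the question.
val dfSample = Seq((1234, "BD Plastipak", "BD Plastipak 1/4\" Syringes"))
  .toDF("ProductId", "ProductCode", "ProductDesc")

val csvOptions = Map(
  "header" -> "true",
  "delimiter" -> "\t",
  // Assumption: with escapeQuotes=false, a value is not enclosed in quotes
  // merely because it contains a quote character.
  "escapeQuotes" -> "false"
  // "escape" sets the character used to escape quotes inside a quoted value
  // (backslash by default); it does not by itself remove the enclosing quotes.
)

dfSample
  .coalesce(1)                    // single output part file
  .write
  .format("csv")
  .mode("overwrite")
  .options(csvOptions)
  .save("/tmp/csv-quote-sketch")  // hypothetical output path
```

Values that contain the delimiter or a line break would most likely still be quoted, so this only helps when the tab character never appears inside the data.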

tarente · 09-21-2021

Hi Shan,

Thanks for the link.

I now know more options for creating different csv files.

I have not fully solved the problem yet, but what remains is related to the destination application (ThoughtSpot) not being able to load the data in the CSV file correctly.

Regards,

Tiago R.