In a project we use Azure Databricks to create csv files to be loaded into ThoughtSpot.
Below is a sample of the code I use to write the file:
    // dfFinal (the DataFrame) and filePath (the target path) are defined earlier in the notebook
    val fileRepartition = 1          // write a single output file
    val fileFormat = "csv"
    val fileSaveMode = "overwrite"
    val fileOptions = Map(
      "header" -> "true",
      "overwriteSchema" -> "true",
      "delimiter" -> "\t"            // tab as the column separator
    )

    dfFinal
      .repartition(fileRepartition)
      .write
      .format(fileFormat)
      .mode(fileSaveMode)
      .options(fileOptions)
      .save(filePath)
The csv created uses a tab as the column separator, and some of the columns may have a double quote (") in their values. When that happens, the value of that column is enclosed in double quotes in the csv file and the embedded quote is escaped with a backslash. E.g.:

    ProductId ProductCode ProductDesc
    1234 BD Plastipak "BD Plastipak 1/4\" Syringes"

Is it possible to change the write options so that the file is written as shown below, without the enclosing quotes and the escape character?

    ProductId ProductCode ProductDesc
    1234 BD Plastipak BD Plastipak 1/4" Syringes
As a workaround, I run sed in a subsequent step to update the csv, but it would be much easier if I could get the file in the correct format when saving it from the notebook.
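
For illustration, the clean-up that such a post-processing step has to perform is roughly what the Scala sketch below does. It is not the actual sed command. The names `UnquoteSketch`, `unquoteField` and `unquoteLine` are hypothetical helpers, and the sketch assumes that a quoted value starts and ends with ", that embedded quotes are escaped as \" (as in the example above), that values never contain a tab, and that the columns in the sample line are split at the tab positions shown.

```scala
// Illustrative sketch only: undoes the quoting shown in the example above.
// UnquoteSketch, unquoteField and unquoteLine are hypothetical names,
// not part of Spark or Databricks.
object UnquoteSketch {

  // Strip the enclosing double quotes from one field and turn \" back into ".
  // Fields that are not enclosed in quotes are returned unchanged.
  def unquoteField(field: String): String =
    if (field.length >= 2 && field.startsWith("\"") && field.endsWith("\""))
      field.substring(1, field.length - 1).replace("\\\"", "\"")
    else
      field

  // Apply the clean-up to every tab-separated field of a line.
  // Assumption: values never contain a tab themselves.
  def unquoteLine(line: String): String =
    line.split("\t", -1).map(unquoteField).mkString("\t")

  def main(args: Array[String]): Unit = {
    // Sample row from above; the tab positions are assumed.
    val line = "1234\tBD Plastipak\t\"BD Plastipak 1/4\\\" Syringes\""
    println(unquoteLine(line))
  }
}
```

Having to run something like this over every file after it has been written is the extra step I would like to avoid.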
Thanks in advance,
Tiago R.