How to write *.csv file from DataBricks FileStore

Jeff1
Contributor II

I'm struggling with how to export a Spark dataframe as a *.csv file to a local computer. I'm successfully using the spark_write_csv function (sparklyr R library) to write the csv file out to my Databricks dbfs:FileStore location, but (I'm assuming) Databricks is creating 4 *.csv partition files. I gather from the Databricks documentation that I need to coalesce the partition files, so I am using the following command:

df.coalesce(1).write.option("header","true").csv("dbfs:FileStore/temp/df.csv")

And then I receive a NameError: name 'df' is not defined.

So am I missing a step, or is the syntax wrong? I'm working in an R notebook.

Jeff

1 ACCEPTED SOLUTION

Hubert-Dudek
Esteemed Contributor III

sparklyr has a different syntax. It provides the function sdf_coalesce.

The code you pasted is for Scala/Python. Additionally, even in Python you can only specify a folder, not a file, so use csv("dbfs:FileStore/temp/").
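
For anyone landing here from a Python notebook, here is a minimal sketch of the folder-based write described above. The helper name write_single_csv is made up for illustration, the path is the one from this thread written in the usual absolute form, and mode("overwrite") is my own assumption so that a re-run replaces earlier output.

```python
def write_single_csv(df, folder):
    """Write a Spark DataFrame as a single CSV part file inside `folder`.

    `df` is expected to be a PySpark DataFrame. `folder` is a directory,
    not a file name: Spark picks the part-*.csv file name itself.
    """
    (
        df.coalesce(1)                   # one partition, so one part file
        .write.option("header", "true")  # keep the column names as a header row
        .mode("overwrite")               # assumption: replace any earlier output
        .csv(folder)
    )


# Example call in a Python notebook cell (path taken from the thread):
# write_single_csv(df, "dbfs:/FileStore/temp/")
```

After the write, the folder holds one part-*.csv file alongside Spark's marker files, and that part file is the one to download.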

5 REPLIES 5

Ok, that helped. I was able to use the sdf_coalesce function and now have 1 partition. How do I then download it from Databricks? It provides a path, and I assume I have to combine it with some other portion of an HTML command.

Hubert-Dudek
Esteemed Contributor III

@Jeff (Customer),

  • Mount your blob storage (or S3) to Databricks and save the file there. Then you can get it from a browser or with an app like Storage Explorer.
  • OR use the display(your_dataframe) function; its output has an export option.

Yes. I still don't understand why I couldn't use the Python code after %python, but nevertheless Mr. Dudek's response corrected my course. I was also able to download the file to my local drive after playing with the HTML path. The Databricks documentation was not helpful, or maybe I just didn't find the correct help document.
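
On the %python question, a likely explanation is that each notebook language keeps its own variables, so a data frame created in an R cell is not defined as df in a Python cell, which matches the NameError. A minimal sketch of one workaround, assuming the sparklyr table was first registered as a temporary view (for instance with sdf_register) and that the view is visible to the Python cell. The helper name and the view name "df_export" are hypothetical.

```python
def load_view_in_python(spark, view_name):
    """Return the DataFrame behind a temporary view registered elsewhere."""
    # Look the data up by its view name. A variable created in an R cell
    # (such as `df`) does not exist in the Python interpreter.
    return spark.table(view_name)


# Example call in a %python cell, where `spark` is the notebook's session:
# df = load_view_in_python(spark, "df_export")
```

Once df exists on the Python side, the coalesce-and-write command from the original question can be run against it.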

jose_gonzalez
Databricks Employee

Hi @Jeff Reichman,

Once you have saved the file to FileStore (https://docs.databricks.com/data/filestore.html#save-a-file-to-filestore), you can follow the instructions from here to download it to your local machine.
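
As a rough illustration of the "HTML path" mentioned earlier in the thread: as far as I know, the FileStore documentation describes files under /FileStore as reachable in a browser at https://<databricks-instance>/files/<path-to-file>, optionally with ?o=<workspace id> appended. The sketch below only builds that URL from a DBFS path. The function name, the host, and the file name in the usage lines are placeholders, not values from this thread.

```python
from urllib.parse import quote

# Path prefixes that all refer to the FileStore root.
FILESTORE_PREFIXES = ("dbfs:/FileStore/", "/dbfs/FileStore/", "/FileStore/")


def filestore_download_url(dbfs_path, workspace_host, workspace_id=None):
    """Map a FileStore path to the browser URL pattern for downloads."""
    for prefix in FILESTORE_PREFIXES:
        if dbfs_path.startswith(prefix):
            # Everything after the FileStore root goes under /files/.
            relative = quote(dbfs_path[len(prefix):])
            url = f"https://{workspace_host}/files/{relative}"
            if workspace_id is not None:
                url += f"?o={workspace_id}"  # the number after o= in your workspace URL
            return url
    raise ValueError(f"not a FileStore path: {dbfs_path!r}")


# Placeholder host and file name, for illustration only:
# filestore_download_url("dbfs:/FileStore/temp/part-00000.csv",
#                        "example.cloud.databricks.com")
```

Pasting the resulting URL into a browser that is logged in to the workspace should start the download, assuming the workspace allows FileStore downloads.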
