Databricks Community

gianni77 · ‎07-29-2015

The "Download CSV" button in the notebook seems to work only for results <=1000 entries. How can I export larger result-sets as CSV?

User16301467534 · ‎07-31-2015

Hi,

You can write the results back to S3 and download them from directly from S3. Our DB Guide notebooks should have a sample for you to look at. Please search for Download All SQL and you should see Scala and python examples.

PaulLintilhac · ‎08-24-2017

It woul be really great if you could add a simple option to download a table in the "data" tab. As a data scientist, I can say this is a pretty basic part of the data science process -- for example if you are building documentation or whitepapers for a model, you will need samples of data, correlation matrices, etc.

Anonymous · ‎08-28-2017

You can also write to dbfs:/FileStore/download.csv, and then download by going to https://community.cloud.databricks.com/files/download.csv

jiaxuan · ‎05-29-2019

Hi doug,

Can you provide more details on this? Not sure how to write to the filestore.

Thanks.

pepevo · ‎02-18-2020

yes, would you please provide more in detail on this? the link is not working (HTTP ERROR: 404" thank you

Anonymous · ‎06-10-2020

To write to the filestore you'd do:

df.write.option("header", "true").csv("dbfs:/FileStore/download.csv") and then download from

https://community.cloud.databricks.com/files/download.csv

However, Databricks now has the feature to just click "Download" when displaying a result-set, which the other answers below also mention. It will download the first 1 million rows.

pepevo · ‎06-03-2020

I have asked this question before since the link is broken.

How do I export sql query into csv and send email out. Also, how can I select a billion records in csv. Databrick allows to export only max a million record but we have 1.5 billion

ColbyCarrillo · ‎06-10-2020

Hello,

I usually use pyspark to do such action. Some example code of how I write it to DBFS from a pyspark data frame:

data_frame.coalesce(1).write.format("com.databricks.spark.csv").save("/FileStore/tables/your_file_name.csv")

If you are curious how to transfer from straight SQL to pyspark, you can try creating a temporary table within SQL and then reading it into a pyspark data frame to then write it out. Below code is given you have imported the respective libraries.

%python

data_frame = spark.table("your_temp_table")

Please note I normally do not have export data of this size. You may need to research more the coalesce function and may be required to combine multiple csv files.

ColbyCarrillo · ‎06-10-2020

After rereading your question, this is quite simple, when downloading a csv from the notebook there will be a down arrow indicator on the right side of the symbol. All you need to do is click that drop down and click download full results (1,000,000 max)

pepevo · ‎06-10-2020

Cobly,

I know how to download a csv from a notebook via using a down arrow indicator on the bottom of the query. But I want to know how to create a script to pull the csv directly send out email and/or store directly in the indicate folder like oracle/mysql. Also, the query is about 1.5 billion records, 1 million max is not an option.

Thank you.

Bach-Nga

AlvinHoffman · ‎02-04-2021

You can compose the outcomes back to S3 and download them from straightforwardly from S3. Our DB Guide note pads ought to have an example for you to take a gander at. If it's not too much trouble, look for Download All SQL and you should see Scala and python models.

Anonymous · ‎03-31-2022

Now within the Databricks SQL interface (within the SQL editor) you can actually download the full results as a csv. Just make sure to uncheck "LIMIT 1000" and then click the download button under "..." in the bottom left: Screenshot of download button location

igorstar · ‎05-29-2024

If you have a large dataset, you might want to export it to a bucket in parquet format from your notebook:

%python

df = spark.sql("select * from your_table_name")
df.write.parquet(your_s3_path)

Databricks Community

How can I export a result of a SQL query from a databricks notebook?

Photos

Connect with Databricks Users in Your Area

Securely share data, analytics and AI

Data Intelligence for Data Engineers

Databricks Learning Festival (Virtual): 15 January - 31 January 2025