cancel
Showing results for 
Search instead for 
Did you mean: 
Get Started Discussions
Start your journey with Databricks by joining discussions on getting started guides, tutorials, and introductory topics. Connect with beginners and experts alike to kickstart your Databricks experience.
cancel
Showing results for 
Search instead for 
Did you mean: 

Query results in csv file include 'null' string for blank cell

AlexG
New Contributor III

After running a sql script, when downloading the results to a csv file, the file includes a null string for blank cells (see screenshot). Is ther a setting I can change to simply get empty cells instead?

AlexG_1-1702927614092.png

 

5 REPLIES 5

jose_gonzalez
Databricks Employee
Databricks Employee

You need to use "emptyValue" when writing you data

df.write.csv(PATH, header=True, emptyValue='')

  Here is a list of APIs references https://spark.apache.org/docs/latest/sql-data-sources-csv.html

AlexG
New Contributor III

Thank you. How would you download to csv the results from a SQL script without the null strings in empty cells?

AlexG_1-1714142384717.png

I mainly work with SQL notebooks

 

 

 

NandiniN
Databricks Employee
Databricks Employee

Hi AlexG,

I tested with the table content containing null and with empty data and it works as expected in the download option too.

Here is an eg:

CREATE TABLE my_table_null_test1 (
id INT,
name STRING
);

INSERT INTO my_table_null_test1 (id, name) VALUES
(1, 'John Doe'),
(2, NULL),
(3, 'Jane Doe'),
(NULL, 'Anonymous')
(4, '');
Screenshot 2024-05-01 at 11.31.00.png

If you do not want nulls, you may have to modify the query to use 

coalesce(NULL, '') or
nvl(NULL, '');

 Hope it helps. Thanks!

Gabriel2
New Contributor II

Also running into this, for fields with date datatype

NULL values end up with "null" strings in the downloaded csv file along with actual dates. This causes import problems when trying to load into other systems and the field is defined as date. 

only solution is to cast null to a default date value like - '1900-01-01' in the query but this is not ideal. Would love a better built-in solution!

NandiniN
Databricks Employee
Databricks Employee

I understand, however this is more on CSV file format. 

Save your data in Delta format instead of CSV or text-based formats. Delta tables handle empty strings and NULL values more effectively, ensuring that empty strings are preserved during data insertion.

https://kb.databricks.com/en_US/data/empty-string-values-convert-to-null-values-when-saving-a-table-...