cancel
Showing results for 
Search instead for 
Did you mean: 
Community Platform Discussions
Connect with fellow community members to discuss general topics related to the Databricks platform, industry trends, and best practices. Share experiences, ask questions, and foster collaboration within the community.
cancel
Showing results for 
Search instead for 
Did you mean: 

Query results in csv file include 'null' string for blank cell

AlexG
New Contributor III

After running a sql script, when downloading the results to a csv file, the file includes a null string for blank cells (see screenshot). Is ther a setting I can change to simply get empty cells instead?

AlexG_1-1702927614092.png

 

5 REPLIES 5

jose_gonzalez
Databricks Employee
Databricks Employee

You need to use "emptyValue" when writing you data

df.write.csv(PATH, header=True, emptyValue='')

  Here is a list of APIs references https://spark.apache.org/docs/latest/sql-data-sources-csv.html

AlexG
New Contributor III

Thank you. How would you download to csv the results from a SQL script without the null strings in empty cells?

AlexG_1-1714142384717.png

I mainly work with SQL notebooks

 

 

 

NandiniN
Databricks Employee
Databricks Employee

Hi AlexG,

I tested with the table content containing null and with empty data and it works as expected in the download option too.

Here is an eg:

CREATE TABLE my_table_null_test1 (
id INT,
name STRING
);

INSERT INTO my_table_null_test1 (id, name) VALUES
(1, 'John Doe'),
(2, NULL),
(3, 'Jane Doe'),
(NULL, 'Anonymous')
(4, '');
Screenshot 2024-05-01 at 11.31.00.png

If you do not want nulls, you may have to modify the query to use 

coalesce(NULL, '') or
nvl(NULL, '');

 Hope it helps. Thanks!

Gabriel2
New Contributor II

Also running into this, for fields with date datatype

NULL values end up with "null" strings in the downloaded csv file along with actual dates. This causes import problems when trying to load into other systems and the field is defined as date. 

only solution is to cast null to a default date value like - '1900-01-01' in the query but this is not ideal. Would love a better built-in solution!

NandiniN
Databricks Employee
Databricks Employee

I understand, however this is more on CSV file format. 

Save your data in Delta format instead of CSV or text-based formats. Delta tables handle empty strings and NULL values more effectively, ensuring that empty strings are preserved during data insertion.

https://kb.databricks.com/en_US/data/empty-string-values-convert-to-null-values-when-saving-a-table-...

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group