Differences between lit(None) or lit(None).cast('string')

SaraCorralLou
New Contributor III

I want to define a column with null values in my DataFrame using PySpark. This column will later be used in other calculations.

What is the difference between creating it in these two different ways?

  • df.withColumn("New_Column", lit(None))
  • df.withColumn("New_Column", lit(None).cast('string'))

Can both be used? Is there a wrong one?

Thank you so much!

1 ACCEPTED SOLUTION

Murthy1
Contributor II

Hello!

An elegant way to define a null-valued column in a DataFrame is to cast the null literal to an explicit type:

df.withColumn("New_Column", lit(None).cast(StringType()))

If you are working only with DataFrames (and no file formats are involved), you can also leave the column as NullType(), which is the type lit(None) produces when no cast is applied.

3 REPLIES

Anonymous
Not applicable

Hi @Sara Corral,

Thank you for your question! To assist you better, please take a moment to review the answer and let me know if it best fits your needs.

Please help us select the best solution by clicking on "Select As Best" if it does.

Your feedback will help us ensure that we are providing the best possible service to you. Thank you!

shadowinc
New Contributor III

For me, this didn't work:

df.withColumn("New_Column", lit(None).cast(StringType()))

I used this instead:

df.withColumn("New_Column", lit(null).cast(StringType))
