
Differences between lit(None) or lit(None).cast('string')

SaraCorralLou
New Contributor III

I want to add a column of null values to my DataFrame using PySpark. This column will later be used in other calculations.

What is the difference between creating it in these two different ways?

  • df.withColumn("New_Column", lit(None))
  • df.withColumn("New_Column", lit(None).cast('string'))

Can both be used? Is there a wrong one?

Thank you so much!

1 ACCEPTED SOLUTION


Murthy1
Contributor II

Hello!

An elegant way to define an empty (all-null) column in a DataFrame is to cast the null literal to an explicit type. cast(StringType()) and cast('string') are equivalent:

df.withColumn("New_Column", lit(None).cast(StringType()))

If you are only working with DataFrames (and no file formats are involved), you can also leave the column as NullType(), which is the type lit(None) produces when no cast is applied.


2 REPLIES

Anonymous
Not applicable

Hi @Sara Corral,

Thank you for your question! To assist you better, please take a moment to review the answer and let me know if it best fits your needs.

Please help us select the best solution by clicking on "Select As Best" if it does.

Your feedback will help us ensure that we are providing the best possible service to you. Thank you!
