cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
cancel
Showing results for 
Search instead for 
Did you mean: 

Cannot Convert Column to Bool Error - When Converting dataframe column which is in string to date type in python

desai_n_3
New Contributor II

Hi All,

I am trying to convert a dataframe column which is in the format of string to date type format yyyy-MM-DD?

I have written a sql query and stored it in dataframe.

df3 = sqlContext.sql(sqlString2)
df3.withColumn(df3['CalDay'],pd.to_datetime(df3['CalDay'],format = '%Y%m%d'))

I get this error

ValueError: Cannot convert column into bool: please use '&' for 'and', '|' for 'or', '~' for 'not' when building DataFrame boolean expressions.

How is this happening?

6 REPLIES 6

shyam_9
Valued Contributor
Valued Contributor

Hi @desai.n.3, In scala please use coalesce function to convert Date format,

df.withColumn("Date", coalesce(
  to_timestamp($"Date", "yyyyMMdd"))).show<br>

desai_n_3
New Contributor II

Hi Thanks for the reply,

It says coalesce fucntion not defined, plus gives invalid syntax with $

shyam_9
Valued Contributor
Valued Contributor

If you are using the Databricks notebook the above code works and for Pandas use below code,

df['Date']= pd.to_datetime(df['Date'])

JoshuaJames
New Contributor II

You gave him scala code, his code is obviously python

"Databricks notebooks" can be python, scala even SQL so my advice is check the question and answer in the language they are using at least (both if you know it for future people)

JoshuaJames
New Contributor II

Registered to post this so forgive the formatting nightmare

This is a python databricks script function that allows you to convert from string to datetime or date and utilising coalesce

from pyspark.sql.functions import coalesce, to_date

def to_datetime_(col, format): # Spark 2.2 or later syntax, for < 2.2 use unix_timestamp and cast return coalesce(to_date(col, format))

df3 = sqlContext.sql(sqlString2)df3.withColumn(df3['CalDay'],pd.to_datetime_(df3['CalDay'],format = '%Y%m%d'))

the resulting comment is looking nothing like it does in edit mode

this is the function

def to_datetime_(col, format): 
# Spark 2.2 or later syntax, for < 2.2 use unix_timestamp and cast 
    return coalesce(to_date(col, format))

Welcome to Databricks Community: Lets learn, network and celebrate together

Join our fast-growing data practitioner and expert community of 80K+ members, ready to discover, help and collaborate together while making meaningful connections. 

Click here to register and join today! 

Engage in exciting technical discussions, join a group with your peers and meet our Featured Members.