cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

Cannot Convert Column to Bool Error - When Converting dataframe column which is in string to date type in python

desai_n_3
New Contributor II

Hi All,

I am trying to convert a dataframe column which is in the format of string to date type format yyyy-MM-DD?

I have written a sql query and stored it in dataframe.

df3 = sqlContext.sql(sqlString2)
df3.withColumn(df3['CalDay'],pd.to_datetime(df3['CalDay'],format = '%Y%m%d'))

I get this error

ValueError: Cannot convert column into bool: please use '&' for 'and', '|' for 'or', '~' for 'not' when building DataFrame boolean expressions.

How is this happening?

6 REPLIES 6

shyam_9
Databricks Employee
Databricks Employee

Hi @desai.n.3, In scala please use coalesce function to convert Date format,

df.withColumn("Date", coalesce(
  to_timestamp($"Date", "yyyyMMdd"))).show<br>

desai_n_3
New Contributor II

Hi Thanks for the reply,

It says coalesce fucntion not defined, plus gives invalid syntax with $

shyam_9
Databricks Employee
Databricks Employee

If you are using the Databricks notebook the above code works and for Pandas use below code,

df['Date']= pd.to_datetime(df['Date'])

JoshuaJames
New Contributor II

You gave him scala code, his code is obviously python

"Databricks notebooks" can be python, scala even SQL so my advice is check the question and answer in the language they are using at least (both if you know it for future people)

JoshuaJames
New Contributor II

Registered to post this so forgive the formatting nightmare

This is a python databricks script function that allows you to convert from string to datetime or date and utilising coalesce

from pyspark.sql.functions import coalesce, to_date

def to_datetime_(col, format): # Spark 2.2 or later syntax, for < 2.2 use unix_timestamp and cast return coalesce(to_date(col, format))

df3 = sqlContext.sql(sqlString2)df3.withColumn(df3['CalDay'],pd.to_datetime_(df3['CalDay'],format = '%Y%m%d'))

the resulting comment is looking nothing like it does in edit mode

this is the function

def to_datetime_(col, format): 
# Spark 2.2 or later syntax, for < 2.2 use unix_timestamp and cast 
    return coalesce(to_date(col, format))

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group