
How to convert a column type from string to date in Spark SQL when the format is not yyyy-MM-dd?

semihcandoken
New Contributor

I imported a large CSV file into Databricks as a table.

I am able to run SQL queries on it in a Databricks notebook.

In my table, I have a column that contains date information in the mm/dd/yyyy format:

12/29/2015

12/30/2015 etc...

Databricks imported this column as a string (str) instead of a date. Forcing a 'timestamp' type in the Table UI had no effect.

How can I convert this column to a date type inside SQL?

I tried to do

select cast(arrival_date as date) from my_data_table

however, this requires the string column to be in yyyy-MM-dd format, and mine is in mm/dd/yyyy format as mentioned above.

select to_date('15/1/09') as date;

does not work either for the same reason.

What can I do to have a column of dates?

Is it true that Spark SQL does not support 'update' operations? If so, I cannot rearrange the string to fit the expected format either. What options do I have?
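For reference, a minimal sketch of one way to do this without updating rows in place, assuming a table named my_data_table with an arrival_date string column in mm/dd/yyyy format as in the question (the arrival_dt column and the view name below are just illustrative). On Spark 2.2 and later you can also write to_date(arrival_date, 'MM/dd/yyyy') directly in SQL; the unix_timestamp/from_unixtime round trip below works on older versions as well:

// Parse the mm/dd/yyyy string and expose the result as a new view,
// rather than updating the original table in place.
val parsed = spark.sql("""
  SELECT *,
         CAST(from_unixtime(unix_timestamp(arrival_date, 'MM/dd/yyyy')) AS DATE) AS arrival_dt
  FROM my_data_table
""")

parsed.createOrReplaceTempView("my_data_table_with_dates")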

4 REPLIES

JoeConley
New Contributor II

The following worked for me:

// unix_timestamp parses the M/dd/yyyy string; the resulting timestamp is then truncated to a date
df.withColumn("tx_date", to_date(unix_timestamp($"date", "M/dd/yyyy").cast("timestamp")))

Bill_Chambers
Contributor II

to_date($"date", "format") also works (in Spark 2.2 and later), e.g. to_date($"date", "M/dd/yyyy").
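A self-contained sketch of that approach (the two-argument to_date(column, format) is available in Spark 2.2 and later; this assumes a SparkSession named spark, as in a Databricks notebook, and the arrival_dt column name is just illustrative):

import org.apache.spark.sql.functions.to_date
import spark.implicits._

// Sample data in the question's mm/dd/yyyy format
val df = Seq("12/29/2015", "12/30/2015").toDF("arrival_date")

// Parse directly into a DateType column; no intermediate timestamp needed
val withDate = df.withColumn("arrival_dt", to_date($"arrival_date", "MM/dd/yyyy"))

withDate.printSchema   // arrival_dt is of type date
withDate.show()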

shalli
New Contributor II

I tried this with a string column and solved it as below:

// Single-column DataFrame holding an MM-dd-yyyy string
val df = sc.parallelize(Seq("08-26-2016")).toDF("Id")
df.createOrReplaceTempView("table1")

// unix_timestamp parses the string; from_unixtime renders it as yyyy-MM-dd HH:mm:ss
val bdf = spark.sql("""select from_unixtime(unix_timestamp(Id, 'MM-dd-yyyy')) as new_format from table1""")
bdf.printSchema
bdf.show

// Cast the reformatted string to a proper date column
val bbdf = bdf.withColumn("dt", $"new_format".cast("date"))
bbdf.printSchema
bbdf.show

// The date column now works with date functions such as year()
bbdf.select(year($"dt")).show

ShubhamGupta187
New Contributor II

@josephpconley would it be safe to cast a column that contains null values?
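For what it's worth, a quick sketch (under default, non-ANSI settings) suggests it is: to_date returns null for null input, and likewise for strings that don't match the format, so the conversion doesn't fail, it just produces nulls you may want to filter or flag afterwards:

import org.apache.spark.sql.functions.to_date
import spark.implicits._

// One valid value, one null, one string that doesn't match the format
val df = Seq(Some("12/29/2015"), None, Some("not a date")).toDF("arrival_date")

// Both the null and the malformed string come out as null in the date column
df.withColumn("arrival_dt", to_date($"arrival_date", "MM/dd/yyyy")).show()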
