How to convert column type from str to date in sparksql when the format is not yyyy-mm-dd?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
08-18-2016 09:29 PM
I imported a large csv file into databricks as a table.
I am able to run sql queries on it in a databricks notebook.
In my table, I have a column that contains date information in the mm/dd/yyyy format :
12/29/2015
12/30/2015 etc...
Databricks imported this column with type str, instead of date. Forcing a 'timestamp' type in the Table UI did not have any effect.
How can I convert this column type to a date inside sql?
I tried to do
select cast(arrival_date as date) from my_data_table
however, this requires that the str column is in YYYY-mm-dd format. And mine is mm/dd/yyyy format as mentioned above.
select to_date('15/1/09') as date;
does not work either for the same reason.
What can I do to have a column of dates?
Is it true that SparkSQL does not support 'update' operations? In that case, I cannot rearrange the string to fit the format either? What options do I have?
- Labels:
-
SQL
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
07-31-2017 12:01 PM
The following worked for me:
df.withColumn("tx_date", to_date(unix_timestamp($"date", "M/dd/yyyy").cast("timestamp")))
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
08-02-2017 09:12 AM
also to_date($"date", "format") works.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
01-26-2018 06:52 AM
I tried the above problem with a string and solved it as below:
val df = sc.parallelize(Seq("08-26-2016")).toDF("Id")
df.createOrReplaceTempView("table1")
val bdf = spark.sql("""select from_unixtime(unix_timestamp(Id, 'MM-dd-yyyy')) as new_format from table1""")
bdf.printSchema
bdf.show
val bbdf = bdf.withColumn("dt",$"new_format".cast("date"))
bbdf.printSchema
bbdf.show
bbdf.select(year($"dt")).show
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
04-19-2018 09:37 PM
@josephpconley would it be safe to cast a column that contains null values?

