Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Splitting Date into Year, Month and Day, with inconsistent delimiters

PranjalThapar
New Contributor

I am trying to split my Date column, which is currently a string type, into three columns: Year, Month and Day. I use (PySpark):

import pyspark.sql.functions

# Split the string date on '-' and pull out the parts
split_date = pyspark.sql.functions.split(df['Date'], '-')
df = df.withColumn('Year', split_date.getItem(0))
df = df.withColumn('Month', split_date.getItem(1))
df = df.withColumn('Day', split_date.getItem(2))

I run into an issue because half of my dates are separated by '-' and the other half by '/'. How can I use an OR condition to split the Date on either '-' or '/', depending on the case? Additionally, when it is separated by '/' the format is mm/dd/yyyy, and when it is separated by '-' the format is yyyy-mm-dd.

I want the Date column to be separated into Day, Month and Year.
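
For reference, a minimal sketch of one way to handle both formats at once (assuming the values are either yyyy-MM-dd or MM/dd/yyyy, and that to_date returns null when a pattern does not match, which is Spark's default non-ANSI behaviour): parse with each pattern, keep the first successful result with coalesce, then extract the parts.

from pyspark.sql import functions as F

# Parse with both expected patterns; to_date yields null when a pattern fails,
# so coalesce keeps whichever parse succeeded.
parsed = F.coalesce(
    F.to_date(F.col('Date'), 'yyyy-MM-dd'),   # dash-delimited dates
    F.to_date(F.col('Date'), 'MM/dd/yyyy'),   # slash-delimited dates
)

df = (df.withColumn('Year', F.year(parsed))
        .withColumn('Month', F.month(parsed))
        .withColumn('Day', F.dayofmonth(parsed)))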

4 REPLIES

Eve
New Contributor III

Try this 🙂 It works for me on string-type date columns holding values like this: 2016-05-02T18:28:15.790+0000

from pyspark.sql.functions import year, month, dayofmonth, hour

# .show() returns None, so keep the DataFrame assignment and the display separate
df = df1.select("some_id",
                year(df1["date"]).alias('year'),
                month(df1["date"]).alias('month'),
                dayofmonth(df1["date"]).alias('day'),
                hour(df1["date"]).alias('hour'))
df.show()

Eve
New Contributor III

And in Scala - assuming that df1 has a "date" column:

import org.apache.spark.sql.functions._
import org.apache.spark.sql.types._
import org.apache.spark.sql._

val df2 = df1
  .withColumn("year", year(col("date")))
  .withColumn("month", month(col("date")))
  .withColumn("day", dayofmonth(col("date")))
  .withColumn("hour", hour(col("date")))

df2.show(Int.MaxValue)

youssefassouli
New Contributor II

thank you so much, that was helpful

Eve
New Contributor III

Could you please mark it as an answer, if it was helpful? 🙂
