Splitting Date into Year, Month and Day, with inconsistent delimiters
05-04-2017 12:52 PM
I am trying to split my Date column, which is currently a string type, into 3 columns: Year, Month and Day. I use (PySpark):
<code>import pyspark.sql.functions as F

# Split the string Date column on '-' and pick the parts by position
split_date = F.split(df['Date'], '-')
df = df.withColumn('Year', split_date.getItem(0))
df = df.withColumn('Month', split_date.getItem(1))
df = df.withColumn('Day', split_date.getItem(2))</code>
I run into an issue because half my dates are separated by '-' and the other half by '/'. How can I use an OR operation to split the Date by either '-' or '/', depending on the case? Additionally, when it's separated by '/', the format is mm/dd/yyyy, and when it's separated by '-', the format is yyyy-mm-dd.
I want the Date column to be separated into Day, Month and Year.