<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Splitting Date into Year, Month and Day, with inconsistent delimiters in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/splitting-date-into-year-month-and-day-with-inconsistent/m-p/29274#M21018</link>
    <description>&lt;P&gt;I am trying to split my Date column, which is currently a string type, into three columns: Year, Month and Day. I use (PySpark):&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;import pyspark.sql.functions

split_date = pyspark.sql.functions.split(df['Date'], '-')
df = df.withColumn('Year', split_date.getItem(0))
df = df.withColumn('Month', split_date.getItem(1))
df = df.withColumn('Day', split_date.getItem(2))&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;I run into an issue because half of my dates are separated by '-' and the other half by '/'. How can I use an OR operation to split the Date by either '-' or '/', depending on the case? Additionally, when the date is separated by '/', the format is mm/dd/yyyy, and when it is separated by '-', the format is yyyy-mm-dd.&lt;/P&gt;
&lt;P&gt;I want the Date column to be separated into Day, Month and Year.&lt;/P&gt;</description>
    <pubDate>Thu, 04 May 2017 19:52:26 GMT</pubDate>
    <dc:creator>PranjalThapar</dc:creator>
    <dc:date>2017-05-04T19:52:26Z</dc:date>
    <item>
      <title>Splitting Date into Year, Month and Day, with inconsistent delimiters</title>
      <link>https://community.databricks.com/t5/data-engineering/splitting-date-into-year-month-and-day-with-inconsistent/m-p/29274#M21018</link>
      <description>&lt;P&gt;I am trying to split my Date column, which is currently a string type, into three columns: Year, Month and Day. I use (PySpark):&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;import pyspark.sql.functions

split_date = pyspark.sql.functions.split(df['Date'], '-')
df = df.withColumn('Year', split_date.getItem(0))
df = df.withColumn('Month', split_date.getItem(1))
df = df.withColumn('Day', split_date.getItem(2))&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;I run into an issue because half of my dates are separated by '-' and the other half by '/'. How can I use an OR operation to split the Date by either '-' or '/', depending on the case? Additionally, when the date is separated by '/', the format is mm/dd/yyyy, and when it is separated by '-', the format is yyyy-mm-dd.&lt;/P&gt;
&lt;P&gt;I want the Date column to be separated into Day, Month and Year.&lt;/P&gt;</description>
      <pubDate>Thu, 04 May 2017 19:52:26 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/splitting-date-into-year-month-and-day-with-inconsistent/m-p/29274#M21018</guid>
      <dc:creator>PranjalThapar</dc:creator>
      <dc:date>2017-05-04T19:52:26Z</dc:date>
    </item>
    <item>
      <title>Re: Splitting Date into Year, Month and Day, with inconsistent delimiters</title>
      <link>https://community.databricks.com/t5/data-engineering/splitting-date-into-year-month-and-day-with-inconsistent/m-p/29275#M21019</link>
      <description>&lt;P&gt;&lt;/P&gt;
&lt;P&gt;Try this &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt; It works for me on string-type date columns holding values like this: 2016-05-02T18:28:15.790+0000&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;from pyspark.sql.functions import year, month, dayofmonth, hour

# show() returns None, so keep the DataFrame and display it in a separate step
df = df1.select("some_id", year(df1["date"]).alias('year'), month(df1["date"]).alias('month'), dayofmonth(df1["date"]).alias('day'), hour(df1["date"]).alias('hour'))
df.show()&lt;/CODE&gt;&lt;/PRE&gt;
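&lt;P&gt;For the mixed formats described in the question (yyyy-mm-dd with '-' and mm/dd/yyyy with '/'), the per-value logic can be sketched in plain Python. This is only an illustration under those stated assumptions: the helper name split_date and the FORMATS mapping are hypothetical, not part of PySpark or of the code above.&lt;/P&gt;

```python
from datetime import datetime

# Hypothetical mapping, not from the thread's code: the two formats are the
# ones described in the question (yyyy-mm-dd with '-', mm/dd/yyyy with '/').
FORMATS = {"-": "%Y-%m-%d", "/": "%m/%d/%Y"}


def split_date(value):
    """Return (year, month, day) as ints, or None if no format matches."""
    if value is None:
        return None
    text = value.strip()
    for delimiter, fmt in FORMATS.items():
        if delimiter in text:
            try:
                parsed = datetime.strptime(text, fmt)
            except ValueError:
                return None  # right delimiter, but not a valid date
            return (parsed.year, parsed.month, parsed.day)
    return None  # neither delimiter is present


print(split_date("2017-05-04"))
print(split_date("05/04/2017"))
```

&lt;P&gt;In PySpark itself, pyspark.sql.functions.split takes a regular-expression pattern, so a pattern such as '[-/]' should split on either delimiter. The field order still differs between the two formats, so the item positions would have to be chosen per row.&lt;/P&gt;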
&lt;P&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 25 Feb 2019 14:20:14 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/splitting-date-into-year-month-and-day-with-inconsistent/m-p/29275#M21019</guid>
      <dc:creator>Eve</dc:creator>
      <dc:date>2019-02-25T14:20:14Z</dc:date>
    </item>
    <item>
      <title>Re: Splitting Date into Year, Month and Day, with inconsistent delimiters</title>
      <link>https://community.databricks.com/t5/data-engineering/splitting-date-into-year-month-and-day-with-inconsistent/m-p/29276#M21020</link>
      <description>&lt;P&gt;&lt;/P&gt;
&lt;P&gt;And in Scala, assuming that df1 has a "date" column:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;import org.apache.spark.sql.functions._
import org.apache.spark.sql.types._
import org.apache.spark.sql._

val df2 = df1
  .withColumn("year", year(col("date")))
  .withColumn("month", month(col("date")))
  .withColumn("day", dayofmonth(col("date")))
  .withColumn("hour", hour(col("date")))

df2.show(Int.MaxValue)&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Tue, 26 Feb 2019 07:16:40 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/splitting-date-into-year-month-and-day-with-inconsistent/m-p/29276#M21020</guid>
      <dc:creator>Eve</dc:creator>
      <dc:date>2019-02-26T07:16:40Z</dc:date>
    </item>
    <item>
      <title>Re: Splitting Date into Year, Month and Day, with inconsistent delimiters</title>
      <link>https://community.databricks.com/t5/data-engineering/splitting-date-into-year-month-and-day-with-inconsistent/m-p/29277#M21021</link>
      <description>&lt;P&gt;&lt;/P&gt;
&lt;P&gt;Thank you so much, that was helpful.&lt;/P&gt;
&lt;P&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 26 Feb 2019 11:05:19 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/splitting-date-into-year-month-and-day-with-inconsistent/m-p/29277#M21021</guid>
      <dc:creator>youssefassouli</dc:creator>
      <dc:date>2019-02-26T11:05:19Z</dc:date>
    </item>
    <item>
      <title>Re: Splitting Date into Year, Month and Day, with inconsistent delimiters</title>
      <link>https://community.databricks.com/t5/data-engineering/splitting-date-into-year-month-and-day-with-inconsistent/m-p/29278#M21022</link>
      <description>&lt;P&gt;&lt;/P&gt;
&lt;P&gt;Could you please mark it as an answer, if it was helpful? &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt; &lt;/P&gt; 
&lt;P&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 05 Mar 2019 06:53:15 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/splitting-date-into-year-month-and-day-with-inconsistent/m-p/29278#M21022</guid>
      <dc:creator>Eve</dc:creator>
      <dc:date>2019-03-05T06:53:15Z</dc:date>
    </item>
  </channel>
</rss>

