<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic to_timestamp function in non-legacy mode does not parse this format: yyyyMMddHHmmssSS in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/to-timstamp-function-in-non-legacy-mode-does-not-parse-this/m-p/2833#M107</link>
    <description>&lt;P&gt;My dataset contains the datetime string '2023061218154258', and I want to convert it to a timestamp using the code below. However, the format I expect to work, yyyyMMddHHmmssSS, does not. This code reproduces the issue:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;from pyspark.sql.functions import to_timestamp
spark.conf.set("spark.sql.legacy.timeParserPolicy","CORRECTED")
# With the policy set to CORRECTED, a string that fails to parse yields null instead of raising an exception.

df = spark.createDataFrame(
    data=[("1", "2023061218154258"),
          ("2", "20230612181542.58")],
    schema=["id", "input_timestamp"])
df.printSchema()

# Timestamp string to TimestampType
df.withColumn("timestamp", to_timestamp("input_timestamp", format='yyyyMMddHHmmssSS')).show(truncate=False)
df.withColumn("timestamp", to_timestamp("input_timestamp", format='yyyyMMddHHmmss.SS')).show(truncate=False)&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;Output:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;+---+-----------------+---------+
|id |input_timestamp  |timestamp|
+---+-----------------+---------+
|1  |2023061218154258 |null     |
|2  |20230612181542.58|null     |
+---+-----------------+---------+

+---+-----------------+----------------------+
|id |input_timestamp  |timestamp             |
+---+-----------------+----------------------+
|1  |2023061218154258 |null                  |
|2  |20230612181542.58|2023-06-12 18:15:42.58|
+---+-----------------+----------------------+&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;I expected to_timestamp with the format yyyyMMddHHmmssSS to convert the string 2023061218154258 into the timestamp 2023-06-12 18:15:42.58, but it returns null.&lt;/P&gt;&lt;P&gt;When I change the line&lt;/P&gt;&lt;P&gt;spark.conf.set("spark.sql.legacy.timeParserPolicy","CORRECTED")&lt;/P&gt;&lt;P&gt;to&lt;/P&gt;&lt;P&gt;spark.conf.set("spark.sql.legacy.timeParserPolicy","LEGACY")&lt;/P&gt;&lt;P&gt;the issue is solved, but I don't want to use legacy mode because it causes other issues.&lt;/P&gt;</description>
    <pubDate>Tue, 20 Jun 2023 08:51:36 GMT</pubDate>
    <dc:creator>b_1</dc:creator>
    <dc:date>2023-06-20T08:51:36Z</dc:date>
    <item>
      <title>to_timestamp function in non-legacy mode does not parse this format: yyyyMMddHHmmssSS</title>
      <link>https://community.databricks.com/t5/data-engineering/to-timstamp-function-in-non-legacy-mode-does-not-parse-this/m-p/2833#M107</link>
      <description>&lt;P&gt;My dataset contains the datetime string '2023061218154258', and I want to convert it to a timestamp using the code below. However, the format I expect to work, yyyyMMddHHmmssSS, does not. This code reproduces the issue:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;from pyspark.sql.functions import to_timestamp
spark.conf.set("spark.sql.legacy.timeParserPolicy","CORRECTED")
# With the policy set to CORRECTED, a string that fails to parse yields null instead of raising an exception.

df = spark.createDataFrame(
    data=[("1", "2023061218154258"),
          ("2", "20230612181542.58")],
    schema=["id", "input_timestamp"])
df.printSchema()

# Timestamp string to TimestampType
df.withColumn("timestamp", to_timestamp("input_timestamp", format='yyyyMMddHHmmssSS')).show(truncate=False)
df.withColumn("timestamp", to_timestamp("input_timestamp", format='yyyyMMddHHmmss.SS')).show(truncate=False)&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;Output:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;+---+-----------------+---------+
|id |input_timestamp  |timestamp|
+---+-----------------+---------+
|1  |2023061218154258 |null     |
|2  |20230612181542.58|null     |
+---+-----------------+---------+

+---+-----------------+----------------------+
|id |input_timestamp  |timestamp             |
+---+-----------------+----------------------+
|1  |2023061218154258 |null                  |
|2  |20230612181542.58|2023-06-12 18:15:42.58|
+---+-----------------+----------------------+&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;I expected to_timestamp with the format yyyyMMddHHmmssSS to convert the string 2023061218154258 into the timestamp 2023-06-12 18:15:42.58, but it returns null.&lt;/P&gt;&lt;P&gt;When I change the line&lt;/P&gt;&lt;P&gt;spark.conf.set("spark.sql.legacy.timeParserPolicy","CORRECTED")&lt;/P&gt;&lt;P&gt;to&lt;/P&gt;&lt;P&gt;spark.conf.set("spark.sql.legacy.timeParserPolicy","LEGACY")&lt;/P&gt;&lt;P&gt;the issue is solved, but I don't want to use legacy mode because it causes other issues.&lt;/P&gt;</description>
      <pubDate>Tue, 20 Jun 2023 08:51:36 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/to-timstamp-function-in-non-legacy-mode-does-not-parse-this/m-p/2833#M107</guid>
      <dc:creator>b_1</dc:creator>
      <dc:date>2023-06-20T08:51:36Z</dc:date>
    </item>
    <item>
      <title>Re: to_timestamp function in non-legacy mode does not parse this format: yyyyMMddHHmmssSS</title>
      <link>https://community.databricks.com/t5/data-engineering/to-timstamp-function-in-non-legacy-mode-does-not-parse-this/m-p/2834#M108</link>
      <description>&lt;P&gt;Hi @Bas van den Berg&lt;/P&gt;&lt;P&gt;Great to meet you, and thanks for your question!&lt;/P&gt;&lt;P&gt;Let's see if your peers in the community have an answer to your question. Thanks.&lt;/P&gt;</description>
      <pubDate>Wed, 21 Jun 2023 06:10:53 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/to-timstamp-function-in-non-legacy-mode-does-not-parse-this/m-p/2834#M108</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2023-06-21T06:10:53Z</dc:date>
    </item>
    <item>
      <title>Re: to_timestamp function in non-legacy mode does not parse this format: yyyyMMddHHmmssSS</title>
      <link>https://community.databricks.com/t5/data-engineering/to-timstamp-function-in-non-legacy-mode-does-not-parse-this/m-p/48966#M28435</link>
      <description>&lt;P&gt;Has anybody else run into this, or can anyone confirm that it is in fact an issue?&lt;/P&gt;</description>
      <pubDate>Wed, 11 Oct 2023 16:20:05 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/to-timstamp-function-in-non-legacy-mode-does-not-parse-this/m-p/48966#M28435</guid>
      <dc:creator>b_1</dc:creator>
      <dc:date>2023-10-11T16:20:05Z</dc:date>
    </item>
  </channel>
</rss>

