When formatting dates using the yyyyMMddHHmmssSSS pattern, an error occurred
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
07-31-2025 12:35 AM
An error occurred while converting a timestamp in the yyyyMMddHHmmssSSS format
from pyspark.sql.functions import to_timestamp_ntz, col, lit
df = spark.createDataFrame(
[("20250730090833000")], ["datetime"])
df2 = df.withColumn("dateformat", to_timestamp_ntz(col("datetime"),lit("yyyyMMddHHmmssSSS")))
df2.display()
error
Text '20250730090833000' could not be parsed at index 0. Use `try_to_timestamp` to tolerate invalid input string and return NULL instead. SQLSTATE: 22007
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
07-31-2025 01:21 AM
hi @liu
I suspect supplying the format as a column is the issue. Could you please try:
df2 = df.withColumn("dateformat", to_timestamp_ntz(col("datetime"),"yyyyMMddHHmmssSSS"))
TheOC
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
08-03-2025 06:30 PM
hi @TheOC
Thank you for your response.
If I don’t use lit, I get an error saying this column cannot be found.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
07-31-2025 02:17 AM
Hi @liu ,
I think it could be related to following bug in Java. I suspect that internally to_timestamp_ntz uses DateTimeFormatter.
Now what's interesting, if the format has a decimal point before the miliseconds SSS, it can be parsed normally (
such as the format yyyyMMddHHmmss.SSS and enter 20240627235959.999).
So one workaround you can try :
from pyspark.sql.functions import to_timestamp_ntz, col, lit
df = spark.createDataFrame(
[("20250730090833000")], ["datetime"])
df2 = df.select(
"datetime",
to_timestamp(
concat(
substring("datetime", 1, 14),
lit('.'),
substring("datetime", 15, 3)
),
'yyyyMMddHHmmss.SSS'
).alias('ts')
)
df2.display()
#20250730090833000
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
08-03-2025 06:48 PM
Thank you for your respons.
I did apply some substring concatenation logic to make the conversion work, but the most straightforward way is still using the yyyyMMddHHmmssSSS format.
I checked the link you shared this appears to be a bug dating back to 2013, yet it's still impacting us in 2025. That’s really unfortunate.