Hi @liu,
I think it could be related to the following bug in Java. I suspect that to_timestamp_ntz internally uses DateTimeFormatter:
[JDK-8031085] DateTimeFormatter won't parse dates with custom format "yyyyMMddHHmmssSSS" - Java Bug ...
Now what's interesting: if the format has a decimal point before the milliseconds (SSS), the value can be parsed normally (for example, the format yyyyMMddHHmmss.SSS with the input 20240627235959.999).
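A minimal sketch to reproduce this behavior (assuming Spark 3.5+, where to_timestamp_ntz is available, and a spark session as in a Databricks notebook; note that to_timestamp_ntz takes its format argument as a column, hence the lit()). Depending on your spark.sql.ansi.enabled setting, the failed parse shows up as NULL or as an error:

from pyspark.sql.functions import to_timestamp_ntz, col, lit

df = spark.createDataFrame([("20250730090833000",)], ["datetime"])

# Pattern without a decimal point: the parse fails
# (NULL with spark.sql.ansi.enabled=false, an error otherwise)
df.select(to_timestamp_ntz(col("datetime"), lit("yyyyMMddHHmmssSSS")).alias("ts")).display()

# The same value shape with a '.' before SSS parses fine
df_ok = spark.createDataFrame([("20240627235959.999",)], ["datetime"])
df_ok.select(to_timestamp_ntz(col("datetime"), lit("yyyyMMddHHmmss.SSS")).alias("ts")).display()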
So here is one workaround you can try:
from pyspark.sql.functions import to_timestamp, concat, substring, lit

# Input value without a decimal point: 20250730090833000
df = spark.createDataFrame([("20250730090833000",)], ["datetime"])

df2 = df.select(
    "datetime",
    to_timestamp(
        # Insert a '.' between the seconds and the milliseconds so the
        # pattern yyyyMMddHHmmss.SSS can parse the value
        concat(
            substring("datetime", 1, 14),
            lit('.'),
            substring("datetime", 15, 3)
        ),
        'yyyyMMddHHmmss.SSS'
    ).alias('ts')
)
df2.display()
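This takes the first 14 characters (yyyyMMddHHmmss) and the 3-digit millisecond part and concatenates them with a '.' in between, so the pattern yyyyMMddHHmmss.SSS avoids the ambiguity described in JDK-8031085. For the sample input 20250730090833000, ts should come out as 2025-07-30 09:08:33. If you specifically need a TIMESTAMP_NTZ result, the same trick should work with to_timestamp_ntz (passing the format wrapped in lit()).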