I presume then that the "datetime" string you provided is a string column?
If so, try the following code. It only splits and reassembles the string, so nothing is ever converted to a timestamp:
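from pyspark.sql.functions import concat, lit, split

# Sample row; "spark" is the session already available in the notebook
data = [{"date_string": "2002-01-01T00:00:00.000"}]

# "2002-01-01T00:00:00.000" -> ["2002", "01", "01T00:00:00.000"]
date_parts = split("date_string", "-")
# "01T00:00:00.000" -> ["01", "00:00:00.000"]
day_and_time = split(date_parts[2], "T")

data_df = spark.createDataFrame(data).withColumn(
    "date_string_reformatted",
    concat(
        date_parts[1],                     # month
        lit("-"),
        day_and_time[0],                   # day
        lit("-"),
        date_parts[0],                     # year
        lit(" "),
        split(day_and_time[1], r"\.")[0],  # time without milliseconds
    ),
)
data_df.display()  # Databricks notebooks; use data_df.show(truncate=False) elsewhere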
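The result: for the sample row, date_string_reformatted contains "01-01-2002 00:00:00" (month-day-year, with the milliseconds dropped).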
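If you want to sanity-check the split indexes without a Spark session, the same slicing can be mirrored in plain Python. This is only a sketch: reformat_date_string is a hypothetical helper name (not part of PySpark), and it assumes every value has the same yyyy-MM-ddTHH:mm:ss.SSS shape as the sample, with no time zone offset.

```python
def reformat_date_string(date_string: str) -> str:
    """Hypothetical helper mirroring the Spark split() calls, string in, string out."""
    # "2002-01-01T00:00:00.000" -> "2002", "01", "01T00:00:00.000"
    # Assumption: exactly two "-" characters, i.e. no negative UTC offset.
    year, month, rest = date_string.split("-")
    # "01T00:00:00.000" -> day "01", time "00:00:00.000"
    day, time_part = rest.split("T")
    # Drop the fractional seconds after the first "."
    time_no_millis = time_part.split(".")[0]
    return f"{month}-{day}-{year} {time_no_millis}"


print(reformat_date_string("2002-01-01T00:00:00.000"))  # 01-01-2002 00:00:00
```

Like the Spark version, this never parses the value as a date, so a malformed input is not validated; it either raises on unpacking or produces a rearranged but meaningless string.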