Options
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
07-16-2019 05:20 AM
Hi @Emiliano Parizzi,
You could parsed the timestamp after loading the file with using the withColumn (cf. https://stackoverflow.com/questions/39088473/pyspark-dataframe-convert-unusual-string-format-to-time....
from pyspark.sql import Row from pyspark.sql.functions import to_timestamp
(sc .parallelize([Row(dt='02/07/2019 14:51:32.869-08:00')]) .toDF() .withColumn("parsed", to_timestamp("dt", "MM/dd/yyyy HH:mm:ss.SSSXXX")) .show(1, False))
+-----------------------------+-------------------+ |dt |parsed | +-----------------------------+-------------------+ |02/07/2019 14:51:32.869-08:00|2019-02-07 22:51:32| +-----------------------------+-------------------+