from_utc_timestamp gives strange results
Thursday
I don't understand why from_utc_timestamp(col("original_time"), "Europe/Berlin") changes the timestamp value instead of just setting the time zone. That is unintuitive behaviour. Here is a minimal example that reproduces it:
from pyspark.sql import Row
from pyspark.sql.types import StructType, StructField, StringType
from pyspark.sql.functions import col, from_utc_timestamp, unix_timestamp

# Pin the session time zone so the cast and unix_timestamp are evaluated in UTC.
spark.conf.set("spark.sql.session.timeZone", "UTC")

data = [Row(original_time="1970-01-01 00:00")]
schema = StructType([StructField("original_time", StringType(), True)])
df = spark.createDataFrame(data, schema)
# Parse the string as a timestamp and record its epoch seconds.
df = df.withColumn("original_time", col("original_time").cast("timestamp"))
df = df.withColumn("original_time_int", unix_timestamp(col("original_time")))
# Convert to Berlin time; this is the step that changes the underlying value.
df = df.withColumn("berlin_time", from_utc_timestamp(col("original_time"), "Europe/Berlin"))
df = df.withColumn("berlin_time_int", unix_timestamp(col("berlin_time")))
display(df)
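
As far as I can tell from the Spark documentation, a timestamp column carries no time zone of its own, so from_utc_timestamp cannot "set" one. Instead it seems to read the value as UTC and return the wall-clock time in the target zone as a new, shifted timestamp. Below is a plain-Python sketch of that reading, not Spark's actual implementation. The helper names are my own (hypothetical), and it assumes Python 3.9+ with time zone data available for zoneinfo.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def shift_like_from_utc_timestamp(naive_utc: datetime, tz_name: str) -> datetime:
    """Hypothetical helper: treat a naive datetime as UTC and return the
    wall-clock time in tz_name, again as a naive datetime."""
    aware_utc = naive_utc.replace(tzinfo=timezone.utc)
    local = aware_utc.astimezone(ZoneInfo(tz_name))
    # Dropping tzinfo mimics a timestamp type that stores no zone.
    return local.replace(tzinfo=None)


def epoch_seconds_utc_session(naive: datetime) -> int:
    """Hypothetical helper: epoch seconds of a naive datetime when the
    session time zone is UTC (the role unix_timestamp plays above)."""
    return int(naive.replace(tzinfo=timezone.utc).timestamp())


original = datetime(1970, 1, 1, 0, 0)
berlin = shift_like_from_utc_timestamp(original, "Europe/Berlin")

original_int = epoch_seconds_utc_session(original)
berlin_int = epoch_seconds_utc_session(berlin)

print(original, original_int)
print(berlin, berlin_int)
```

If this reading is right, it would explain my result: berlin_time shows 1970-01-01 01:00:00 and berlin_time_int is 3600 rather than 0, because Berlin was at UTC+1 on that date.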

