โ01-28-2023 04:51 AM
Hi all,
I've a dataframe with CreateDate column with this format:
CreateDate
/Date(1593786688000+0200)/
/Date(1446032157000+0100)/
/Date(1533904635000+0200)/
/Date(1447839805000+0100)/
/Date(1589451249000+0200)/
and I want to convert that format to date/timestamp, so the excepted output will be:
CreateDate
2020-07-03 14:31:28 +02:00
2015-10-28 11:35:57 +01:00
2018-08-10 12:37:15 +02:00
2015-11-18 09:43:25 +01:00
2020-05-14 10:14:09 +02:00
I have this query in SQL that gives the desired output and that can help to develop:
cast(convert(VARCHAR(30), DATEADD(Second, convert(BIGINT, left(replace(replace(CreateDate, '/date(', ''), ')/', ''), 13)) / 1000, '1970-01-01 00:00:00'), 20) + ' ' + '+' + left(right(replace(replace(CreateDate, '/date(', ''), ')/', ''), 4), 2) + ':' + right(replace(replace(CreateDate, '/date(', ''), ')/', ''), 2) AS DATETIMEOFFSET(0)) AS CreateDate
Can anyone please help me in achieving this?
Thank you!
โ01-28-2023 08:34 PM
Hi @Bruno Francoโ ,
Can you please try the below code, hope it might for you.
from pyspark.sql.functions import from_unixtime
from pyspark.sql import functions as F
final_df = df_src.withColumn("Final_Timestamp", from_unixtime((F.regexp_extract(col("CreateDate"), "(\d+)", 1)/1000),"yyyy-MM-dd HH:mm:ss"))\
.withColumn("Offset_Time", F.regexp_extract(split(df_source['CreateDate'],'\\+')[1], "(\d+)", 1))
display(final_df)
I have divided the value with 1000 because from_unixtime takes arguments in seconds, and your
timestamp is in milliseconds.
Happy Learning!!
โ01-28-2023 08:34 PM
Hi @Bruno Francoโ ,
Can you please try the below code, hope it might for you.
from pyspark.sql.functions import from_unixtime
from pyspark.sql import functions as F
final_df = df_src.withColumn("Final_Timestamp", from_unixtime((F.regexp_extract(col("CreateDate"), "(\d+)", 1)/1000),"yyyy-MM-dd HH:mm:ss"))\
.withColumn("Offset_Time", F.regexp_extract(split(df_source['CreateDate'],'\\+')[1], "(\d+)", 1))
display(final_df)
I have divided the value with 1000 because from_unixtime takes arguments in seconds, and your
timestamp is in milliseconds.
Happy Learning!!
โ01-29-2023 03:43 AM
Thanks a lot @Ratna Chaitanya Raju Bandaruโ , I picked up your code and I did this:
final_df = df.withColumn("CreateDateNew", concat(from_unixtime((F.regexp_extract(col("CreateDate"), "(\d+)", 1)/1000),"yyyy-MM-dd HH:mm:ss"), lit(" "), regexp_replace(regexp_extract("CreateDate", re, 2), "(\\d{2})(\\d{2})" , "$1:$2" )))
and I got the desired output:
Once again, thanks and kind regards ๐
โ01-29-2023 04:00 AM
Hi @Bruno Francoโ ,
Glad it helped you
Happy Learning!!
Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you wonโt want to miss the chance to attend and share knowledge.
If there isnโt a group near you, start one and help create a community that brings people together.
Request a New Group