Convert String to Timestamp
11-18-2017 10:58 AM
I have a dataset with one column of string type (e.g. '2014/12/31 18:00:36'). How can I convert it to timestamp type with PySpark?
- Labels: Conversion, String, Timestamp
11-19-2017 01:26 PM
I am trying to do it this way, but the result is null:
df2 = df.select(col('starting_timestamp'), df.starting_timestamp.cast('timestamp').alias('time'))
+-------------------+----+
| starting_timestamp|time|
+-------------------+----+
|2015/01/01 03:00:36|null|
|2015/01/01 03:01:06|null|
|2015/01/01 03:01:12|null|
|2015/01/01 03:01:20|null|
|2015/01/01 03:01:27|null|
+-------------------+----+
only showing top 5 rows
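The nulls are consistent with how a plain string-to-timestamp cast behaves: as far as I know it only recognizes ISO-style strings such as `2015-01-01 03:00:36`, so slash-separated dates are not understood and come back as null. Here is a minimal sketch of one workaround, assuming every value uses `/` as the date separator: rewrite the separators first, then cast. The helper names are hypothetical, and the PySpark import is done lazily so the plain-Python check at the bottom runs without a Spark installation.

```python
from datetime import datetime


def slashes_to_dashes(value):
    """Turn '2015/01/01 03:00:36' into '2015-01-01 03:00:36'."""
    return value.replace("/", "-")


def cast_slash_timestamps(df, column="starting_timestamp"):
    """Hypothetical helper: normalize the separators in Spark, then cast.

    Assumes `df` is a PySpark DataFrame with a string column `column`.
    """
    # Imported here so the rest of this sketch works without PySpark.
    from pyspark.sql.functions import col, regexp_replace

    normalized = regexp_replace(col(column), "/", "-")
    return df.select(col(column), normalized.cast("timestamp").alias("time"))


# Plain-Python illustration of the same idea on a single value.
fixed = slashes_to_dashes("2015/01/01 03:00:36")
parsed = datetime.fromisoformat(fixed)
```

With a DataFrame like the one in this post, `cast_slash_timestamps(df).show(5)` would be the equivalent of the `select` that returned nulls.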
11-19-2017 01:42 PM
I found the solution. It is as follows:
df2 = df.select('ID', 'starting_timestamp', unix_timestamp('starting_timestamp', "yyyy/MM/dd HH:mm:ss").cast(TimestampType()).alias("timestamp"))
03-25-2019 12:21 AM
Hi,
I am facing the same problem in PySpark: I get null after converting to timestamp. The dataset is similar to the one above, with some additional columns.
df2 = df.select('Customer', 'Transaction_Timestamp', 'Transaction_Base_Point_Value', unix_timestamp('Transaction_Timestamp', "yyyy/MM/dd HH:mm:ss").cast(TimestampType()).alias("timestamp"))
|-- Customer: string (nullable = true)
|-- Transaction_Timestamp: string (nullable = true)
|-- Transaction_Base_Point_Value: integer (nullable = true)
|-- timestamp: timestamp (nullable = true)
But the timestamp column in the output contains only null.
03-26-2019 05:29 AM
Hi,
It is strange that it returns null; it works fine for me in PySpark. Could you please compare your code with the example below? Also try displaying the original dataframe, and make sure its values display properly and have the appropriate data type (StringType).
```
from pyspark.sql.functions import unix_timestamp, col
from pyspark.sql.types import TimestampType
from pyspark.sql.types import StringType
df = spark.createDataFrame(["2015/01/01 03:00:36"], StringType()).toDF("ts_string")
df1 = df.select(unix_timestamp(df.ts_string, 'yyyy/MM/dd HH:mm:ss').cast(TimestampType()).alias("timestamp"))
df1.show()
```
If that still doesn't resolve it, please share the full code, including how you create the original dataframe. Please let us know how it goes.
Thanks
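To follow up on checking the original values: one way to narrow this down is to list only the rows whose string is present but does not parse. The sketch below is a rough illustration, not a definitive diagnostic. Both helper names are hypothetical, the default patterns are the ones used earlier in this thread, and the PySpark import is lazy so the plain-Python check can be tried on a few sample strings without Spark.

```python
from datetime import datetime


def unparseable(values, py_pattern="%Y/%m/%d %H:%M:%S"):
    """Return the sample strings that do not match the expected layout.

    Plain-Python check using strptime; handy for a handful of values
    copied out of df.show().
    """
    bad = []
    for value in values:
        try:
            datetime.strptime(value, py_pattern)
        except (TypeError, ValueError):
            bad.append(value)
    return bad


def show_unparsed_rows(df, column, spark_pattern="yyyy/MM/dd HH:mm:ss", n=20):
    """Hypothetical helper: show rows where the string exists but parses to null.

    Assumes `df` is a PySpark DataFrame and `column` is a string column.
    """
    # Imported here so the function above works without PySpark.
    from pyspark.sql.functions import col, unix_timestamp

    parsed = unix_timestamp(col(column), spark_pattern)
    df.filter(col(column).isNotNull() & parsed.isNull()).select(column).show(n, False)


samples = [
    "2015/01/01 03:00:36",     # matches the pattern
    "2015-01-01 03:00:36",     # dashes instead of slashes
    "01/01/2015 03:00:36",     # day or month first
    "2015/01/01 03:00:36 PM",  # extra trailing text
]
problems = unparseable(samples)
```

For the dataframe above, `show_unparsed_rows(df, "Transaction_Timestamp")` would print the offending values untruncated, which usually makes a different separator or field order easy to spot.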
05-08-2019 06:49 AM
Hi,
I have Spark 1.6.0 on Cloudera 5.13.0.
I have the same problem. This is my full code; please help me.
This is the format of my rows: 25/Jan/2016:21:26:37 +0100
from pyspark.sql import HiveContext
from pyspark.sql.functions import unix_timestamp, col
from pyspark.sql.types import TimestampType
from pyspark.sql.types import StringType
sqlContext = HiveContext(sc)
df=sqlContext.sql("select * from test.test")
df1 = df.select(unix_timestamp(df.date_hour, 'yyyy/MM/dd:HH:mm:ss').cast(TimestampType()).alias("timestamp"))
df1.show()
It is still null.
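One likely cause here is that the pattern does not describe the data: `yyyy/MM/dd:HH:mm:ss` expects a four-digit year first and a numeric month, while `25/Jan/2016:21:26:37 +0100` starts with the day, uses an abbreviated month name, and ends with a UTC offset. A minimal sketch follows, assuming the Java `SimpleDateFormat`-style pattern `dd/MMM/yyyy:HH:mm:ss Z` is the right description of that layout. I have not run it on Spark 1.6, so treat the pattern as an assumption to verify. The helper names are hypothetical, and the PySpark import is lazy so the local check runs without Spark.

```python
from datetime import datetime, timedelta, timezone

SAMPLE = "25/Jan/2016:21:26:37 +0100"

# Assumed Spark (Java SimpleDateFormat-style) pattern for that layout.
SPARK_PATTERN = "dd/MMM/yyyy:HH:mm:ss Z"

# Equivalent pattern for Python's strptime, used only for the local check.
PY_PATTERN = "%d/%b/%Y:%H:%M:%S %z"


def parse_sample(value=SAMPLE):
    """Parse one value locally to confirm the layout is understood."""
    return datetime.strptime(value, PY_PATTERN)


def add_timestamp_column(df, column="date_hour"):
    """Hypothetical helper: add a `timestamp` column parsed with SPARK_PATTERN.

    Assumes `df` is a PySpark DataFrame with a string column `column`.
    """
    # Imported here so parse_sample() works without PySpark.
    from pyspark.sql.functions import unix_timestamp

    parsed = unix_timestamp(df[column], SPARK_PATTERN).cast("timestamp")
    return df.withColumn("timestamp", parsed)
```

With the code above, that would be `df1 = add_timestamp_column(df)` followed by `df1.show()`. If the column is still null, check for stray whitespace or brackets around the value, and note that parsing month names such as `Jan` may depend on the JVM's default locale.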