Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

How to extract year and week number from a column in a Spark DataFrame?

dshosseinyousef
New Contributor II

I have the following Spark DataFrame:

sale_id | created_at
--------|-------------------------
1       | 2016-05-28T05:53:31.042Z
2       | 2016-05-30T12:50:58.184Z
3       | 2016-05-23T10:22:18.858Z
4       | 2016-05-27T09:20:15.158Z
5       | 2016-05-21T08:30:17.337Z
6       | 2016-05-28T07:41:14.361Z

I need to add a year_week column that contains the year and week number of each row's created_at value:

sale_id | created_at               | year_week
--------|--------------------------|----------
1       | 2016-05-28T05:53:31.042Z | 2016-21
2       | 2016-05-30T12:50:58.184Z | 2016-22
3       | 2016-05-23T10:22:18.858Z | 2016-21
4       | 2016-05-27T09:20:15.158Z | 2016-21
5       | 2016-05-21T08:30:17.337Z | 2016-20
6       | 2016-05-28T07:41:14.361Z | 2016-21

A solution in PySpark, SparkR, or Spark SQL would all be welcome. I have already tried the lubridate package, but since my column is an S4 object I receive the following error:

Error in as.Date.default(head_df$created_at) :
  do not know how to convert 'head_df$created_at' to class “Date”
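A note on the expected output: the year_week values above follow ISO-8601 week numbering, which is what Spark's weekofyear function returns. As a quick sanity check, the same numbers can be reproduced with plain Python's datetime.isocalendar() (a minimal sketch, assuming the timestamps are in UTC):

```python
from datetime import datetime

def year_week(ts: str) -> str:
    """Return 'YYYY-WW' using ISO-8601 week numbers, like Spark's weekofyear."""
    dt = datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%fZ")
    iso_year, iso_week, _ = dt.isocalendar()  # (ISO year, ISO week, ISO weekday)
    return f"{iso_year}-{iso_week}"

print(year_week("2016-05-28T05:53:31.042Z"))  # 2016-21
print(year_week("2016-05-30T12:50:58.184Z"))  # 2016-22
print(year_week("2016-05-21T08:30:17.337Z"))  # 2016-20
```

One caveat: near year boundaries the ISO year can differ from the calendar year (for example, 2016-01-01 falls in ISO week 53 of 2015), so combining Spark's year() with weekofyear() can disagree with strict ISO numbering for those dates. The sample data above is all mid-year, where the two agree.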

2 REPLIES

theodondre
New Contributor II

import org.apache.spark.sql.functions.{concat_ws, weekofyear, year}

val data = spark.read.option("header", "true").option("inferSchema", "true").csv("location of file")

// year() extracts the year and weekofyear() the ISO week number;
// concat_ws joins them with a "-" separator into a single string column
val new_df = data.withColumn("year_week",
  concat_ws("-", year(data("created_at")), weekofyear(data("created_at"))))

new_df.show()

// NOTE: code is in Scala. I did not test this in an IDE, but it should work fine.

theodondre
New Contributor II

[attached screenshot]

This is how the documentation looks.
