
Date schema issues with pyspark dataframe creation

ckwan48
New Contributor III

I'm having some issues with creating a dataframe with a date column. Could I know what is wrong?

from pyspark.sql import SparkSession
from pyspark.sql.types import StructType
from pyspark.sql.types import DateType, FloatType
 
spark = SparkSession.builder.appName('DataFrame').getOrCreate()
schema = StructType() \
      .add("DATE", DateType(), True) \
      .add("A", FloatType(), True) \
      .add("B", FloatType(), True)
 
df = spark.read.format("csv").option("header", True).option("dateFormat", "MM/dd/yyyy").schema(schema).load("test.csv")
 
df.show()

This is the error I'm getting:

org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 158.0 failed 4 times, most recent failure: Lost task 0.3 in stage 158.0 (TID 1823) (10.237.208.145 executor 5): org.apache.spark.SparkUpgradeException: [INCONSISTENT_BEHAVIOR_CROSS_VERSION.PARSE_DATETIME_BY_NEW_PARSER] You may get a different result due to the upgrading to Spark >= 3.0:


Debayan
Esteemed Contributor III

Hi @Kevin Kim, could you please try upgrading the Spark version? Also, please provide the full error logs.

Kaniz
Community Manager

Hi @Kevin Kim, we haven't heard from you since the last response from @Debayan Mukherjee, and I was checking back to see if his suggestions helped you.

Otherwise, if you have found a solution, please share it with the community, as it can be helpful to others.

Also, please don't forget to click the "Select As Best" button whenever the information provided helps resolve your question.

ckwan48
New Contributor III

Hi @Kaniz Fatma,

I actually changed the date format to 'M/d/Y' and it didn't throw any errors. I found that my CSV file had dates like '3/1/2022'. Could that be the issue? But some dates were also like '12/1/2022', so I'm kind of confused.
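
A likely explanation, assuming the file mixes zero-padded and unpadded values: under the Spark 3 parser, a two-letter field such as `MM` or `dd` expects exactly two digits, so 'MM/dd/yyyy' rejects '3/1/2022' and also '12/1/2022' (the day has only one digit), while the single-letter 'M/d' form accepts one or two digits. The sketch below uses plain Python (not Spark) to check which shapes a date column contains. The function name `classify_dates` and the sample values are hypothetical and only for illustration.

```python
import re

# Strict shape, like "MM/dd/yyyy": exactly two digits for month and day.
PADDED = re.compile(r"^\d{2}/\d{2}/\d{4}$")
# Flexible shape, like "M/d/yyyy": one or two digits for month and day.
FLEXIBLE = re.compile(r"^\d{1,2}/\d{1,2}/\d{4}$")


def classify_dates(values):
    """Split date strings into zero-padded, unpadded, and unrecognized."""
    result = {"padded": [], "unpadded": [], "other": []}
    for value in values:
        text = value.strip()
        if PADDED.match(text):
            result["padded"].append(text)
        elif FLEXIBLE.match(text):
            result["unpadded"].append(text)
        else:
            result["other"].append(text)
    return result


# Hypothetical sample values; in practice, read them from the DATE column.
sample = ["3/1/2022", "12/1/2022", "03/01/2022"]
report = classify_dates(sample)
print(report)
```

If unpadded values show up, a pattern like 'M/d/yyyy' is probably the safer choice for the `dateFormat` option. Note the lowercase `y`: in Java-style datetime patterns, uppercase `Y` denotes the week-based year rather than the calendar year.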

Kaniz
Community Manager

Hi @Kevin Kim, could you please reply to @Debayan Mukherjee's response in this thread? Also, please provide the full error logs.
