Hi everyone,
I am stuck on exercise 2.0, Train and Validate ML Model. When I run the code, it fails with a NameError: name 'DoubleType' is not defined.
I have included the code below for reference.
Any help with this would be appreciated.
# Encode a cyclical value (e.g. hour of day) as a sine/cosine pair.
def get_sin_cosine(value, max_value):
    sine = np.sin(value * (2.*np.pi/max_value))
    cosine = np.cos(value * (2.*np.pi/max_value))
    return (sine.tolist(), cosine.tolist())
schema = StructType([
    StructField("sine", DoubleType(), False),
    StructField("cosine", DoubleType(), False)
])
get_sin_cosineUDF = udf(get_sin_cosine, schema)
dataset = (
    dataset
    .withColumn("udfResult", get_sin_cosineUDF(col("hour_of_day"), lit(24)))
    .withColumn("hour_sine", col("udfResult.sine"))
    .withColumn("hour_cosine", col("udfResult.cosine"))
    .drop("udfResult")
    .drop("hour_of_day")
)
dataset = dataset.filter(dataset.totalAmount.isNotNull())
dataset = dataset.withColumn("isPaidTimeOff", col("isPaidTimeOff").cast("integer"))
display(dataset)