Hello,
I made some transformations on a pyspark.sql.Column object:
file_path_splitted = f.split(df[filepath_col_name], '/')  # returns a Column object
file_name = file_path_splitted[f.size(file_path_splitted) - 1]  # returns a Column object
Then I used the variable file_name in the DataFrame.withColumn method:
df_with_file_name = df.withColumn('is_long_file_name',
                                  f.when(f.length(file_name) == 100, 'Yes')
                                   .otherwise('No'))
My question is:
Is there any risk that building transformations on a pyspark.sql.Column outside of the withColumn method could mismatch rows between the Column and the DataFrame? I mean a situation where the rows behind the Column object end up in a different order than the rows of the result DataFrame, so the values in the new column are mismatched.