I am running the code below in a Python notebook, and the col() function is not being recognized.
I want to know whether col() belongs to a specific DataFrame library or Python library. I don't want to use the PySpark API and would like to write the code using the SQL DataFrames API.
Running the code below gives this error: NameError: name 'col' is not defined
peopleDF = spark.read.parquet("/mnt/training/dataframes/people-10m.parquet")
peopleDF.printSchema()
peopleDF.show()
peopleDF.select(col("firstName")).filter(col("firstName") == "An")
As per the Spark docs:
https://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.Column
df("columnName") // On a specific `df` DataFrame.
col("columnName") // A generic column not yet associated with a DataFrame.
col("columnName.field") // Extracting a struct field
col("`a.column.with.dots`") // Escape `.` in column names.
$"columnName" // Scala short hand for a named column.