Hubert-Dudek
Databricks MVP

You need to use the pandas API on Spark, which is built on top of Spark DataFrames. For example, instead of:

from pandas import read_csv

use:

from pyspark.pandas import read_csv

pdf = read_csv("data.csv")

More details are in this blog post: https://databricks.com/blog/2021/10/04/pandas-api-on-upcoming-apache-spark-3-2.html


My blog: https://databrickster.medium.com/
