PySpark vs. Pandas code difference

Krishscientist
New Contributor III

Hi, can you help me understand why the Pandas code is not working while the PySpark code is?

import pandas as pd

pdf = pd.read_csv('/FileStore/tables/new.csv',sep=',')

Error: No such file exists...

The code below works:

df = spark.read.csv("/FileStore/tables/new.csv", sep=",", header='True')

Hubert-Dudek
Databricks MVP

Try adding a prefix to the path: /dbfs/ for Pandas (which reads through the local file system) or dbfs: for Spark.



Krishscientist
New Contributor III

Yes, I tried all of those options, but I still get "no file exists".

So I am converting the PySpark DataFrame to a Pandas DataFrame instead.

I would still like to know why the code below does not work:

pdf = pd.read_csv('/FileStore/tables/new.csv',sep=',')

RRO
Databricks Partner

It might have to do with the path, as @Hubert Dudek already mentioned:

df = spark.read.csv("dbfs:/FileStore/tables/new.csv", sep=",", header='True')
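For the Pandas side, here is a minimal sketch of the prefix idea. Pandas only sees the driver's local file system, so a DBFS path has to be rewritten to its /dbfs/ form before pd.read_csv can find it. The helper name to_local_dbfs_path is hypothetical (it is not part of Pandas or Databricks), and the sketch assumes the /dbfs mount is actually available on your cluster; if it is not, the read will still fail.

```python
def to_local_dbfs_path(path: str) -> str:
    """Rewrite a DBFS path to its local /dbfs/ form (hypothetical helper)."""
    # Already in the local-mount form: leave it unchanged.
    if path.startswith("/dbfs/"):
        return path
    # Strip the Spark-style scheme, e.g. "dbfs:/FileStore/..." -> "/FileStore/...".
    if path.startswith("dbfs:"):
        path = path[len("dbfs:"):]
    # Prepend the local mount point used by non-Spark libraries such as Pandas.
    return "/dbfs/" + path.lstrip("/")


# Usage on a cluster where the /dbfs mount exists (assumption):
# import pandas as pd
# pdf = pd.read_csv(to_local_dbfs_path("/FileStore/tables/new.csv"), sep=",")
```

If the mount is not there, reading with Spark and converting the result to a Pandas DataFrame, as described earlier in the thread, remains the fallback.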
