Databricks Community

Krishscientist · ‎04-12-2022

Hi, can you help me understand why the pandas code below is not working, while the PySpark code works?

import pandas as pd

pdf = pd.read_csv('/FileStore/tables/new.csv',sep=',')

Error: No such file exists...

The code below works:

df = spark.read.csv("/FileStore/tables/new.csv", sep=",", header=True)

RRO · ‎04-12-2022

It might have to do with the path, as @Hubert Dudek already mentioned:

df = spark.read.csv("dbfs:/FileStore/tables/new.csv", sep=",", header=True)

Hubert-Dudek · ‎04-12-2022

Try adding a prefix to the path: /dbfs/ for pandas (which reads through the driver's local file system) or dbfs: for Spark.
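
To illustrate the prefix suggestion: Spark resolves a bare path such as /FileStore/tables/new.csv against DBFS, while pandas looks on the driver's local disk, where DBFS is normally exposed under the /dbfs mount. The helper below is only a sketch; `to_local_dbfs_path` is a hypothetical name, not a Databricks API, and it assumes the /dbfs mount is available on the cluster, which may not be the case on every cluster type.

```python
def to_local_dbfs_path(path: str) -> str:
    """Map a DBFS-style path to the /dbfs mount path that local-file APIs expect.

    Hypothetical helper for illustration; assumes the cluster exposes DBFS at /dbfs.
    """
    # Drop the Spark-style scheme if present: "dbfs:/FileStore/x" -> "/FileStore/x"
    if path.startswith("dbfs:"):
        path = path[len("dbfs:"):]
    # Already a local mount path, nothing to do
    if path.startswith("/dbfs/"):
        return path
    # Prepend the mount point: "/FileStore/x" -> "/dbfs/FileStore/x"
    return "/dbfs/" + path.lstrip("/")


local_path = to_local_dbfs_path("/FileStore/tables/new.csv")
print(local_path)
```

With that path, the pandas call from the question would become pd.read_csv(local_path, sep=','). If the file is still not found, the /dbfs mount is probably not available on that cluster.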


Krishscientist · ‎04-12-2022

Yeah, I tried all the options, but I still get the "no such file exists" error.

So I am converting the PySpark DataFrame to a pandas DataFrame instead.

I would still like to know why the code below is not working:

pdf = pd.read_csv('/FileStore/tables/new.csv',sep=',')
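
The workaround mentioned in this post, reading with Spark and then converting to pandas, might look roughly like the sketch below. `spark_to_pandas` and `max_rows` are hypothetical names introduced here; `limit()` and `toPandas()` are the standard PySpark DataFrame methods. Note that `toPandas()` collects all rows onto the driver, so it is only suitable for data that fits in driver memory.

```python
def spark_to_pandas(spark_df, max_rows=None):
    """Convert a PySpark DataFrame to a pandas DataFrame.

    Hypothetical helper for illustration. `max_rows` optionally caps the
    number of rows pulled to the driver before conversion.
    """
    if max_rows is not None:
        # limit() returns a new Spark DataFrame with at most max_rows rows
        spark_df = spark_df.limit(max_rows)
    # toPandas() collects the data to the driver as a pandas DataFrame
    return spark_df.toPandas()
```

In a Databricks notebook this could be called as pdf = spark_to_pandas(df), where df is the DataFrame returned by spark.read.csv above.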

