Databricks

Krishscientist · ‎04-12-2022

Hi, can you help me understand why this pandas code is not working, while the PySpark version works?

import pandas as pd

pdf = pd.read_csv('/FileStore/tables/new.csv',sep=',')

Error: No such file exists...

The code below works:

df = spark.read.csv("/FileStore/tables/new.csv", sep=",", header='True')

Hubert-Dudek · ‎04-12-2022

Try adding the /dbfs/ or dbfs: prefix to the path.
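As background for the suggestion above: pandas reads through the driver's local file system, while Spark resolves a bare path such as /FileStore/... against DBFS. On clusters that expose the /dbfs mount, the same file is usually reachable from pandas under /dbfs/FileStore/.... The sketch below is only an illustration; the helper names `to_local_dbfs_path` and `read_csv_with_pandas` are made up for this example, and some workspace types may not expose the /dbfs mount at all, in which case this will still fail.

```python
def to_local_dbfs_path(path: str) -> str:
    """Map a DBFS-style path to the local /dbfs mount that pandas can open.

    Hypothetical helper for illustration; it only rewrites the string and
    does not check that the file or the mount actually exists.
    """
    # Drop the Spark-style scheme, e.g. "dbfs:/FileStore/x" -> "/FileStore/x".
    if path.startswith("dbfs:"):
        path = path[len("dbfs:"):]
    # Already a local mount path: leave it unchanged.
    if path.startswith("/dbfs/"):
        return path
    return "/dbfs/" + path.lstrip("/")


def read_csv_with_pandas(path: str):
    """Read a CSV stored in DBFS with pandas (assumes the /dbfs mount is available)."""
    # Imported here so the path helper above can be used without pandas installed.
    import pandas as pd

    return pd.read_csv(to_local_dbfs_path(path), sep=",")
```

With this, `read_csv_with_pandas("/FileStore/tables/new.csv")` would try to open /dbfs/FileStore/tables/new.csv instead of looking for /FileStore/tables/new.csv on the driver's local disk.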

Krishscientist · ‎04-12-2022

Yes, I tried all the options and still get "no such file exists".

So for now I am converting the PySpark DataFrame to a pandas DataFrame.

I would still like to know why the code below is not working:

pdf = pd.read_csv('/FileStore/tables/new.csv',sep=',')
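For reference, the workaround described in this post (read with Spark, then convert to pandas) might look roughly like the sketch below. The function name `spark_csv_to_pandas` is hypothetical, and `spark` is assumed to be the SparkSession that a Databricks notebook provides. Note that `toPandas()` collects the whole DataFrame onto the driver, so this is only suitable for data that fits in driver memory.

```python
def spark_csv_to_pandas(spark, path: str):
    """Read a CSV through Spark and return it as a pandas DataFrame.

    Hypothetical helper for illustration; `spark` is an existing SparkSession.
    """
    # Spark resolves the path against DBFS, so the dbfs: prefix (or a bare path) works here.
    sdf = spark.read.csv(path, sep=",", header=True)
    # Collect all rows to the driver as a pandas DataFrame.
    return sdf.toPandas()
```

In a notebook this would be called as `pdf = spark_csv_to_pandas(spark, "dbfs:/FileStore/tables/new.csv")`.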

RRO · ‎04-12-2022

It might have to do with the path, as @Hubert Dudek already mentioned:

df = spark.read.csv("dbfs:/FileStore/tables/new.csv", sep=",", header='True')

Kaniz · ‎04-13-2022

Hi @Rafael Rockenbach and @Hubert Dudek, it was so nice to have your response. Thank you for the time you put into our community. I really want you to know how much we appreciate that.
