
pandas.read_csv

MarcoMistroni — Wed, 13 Sep 2017 21:23:11 GMT

Hi all,

I have uploaded a file to my cluster, at this location:

/FileStore/tables/qmwxhxvi1505337108590/PastHires.csv

However, whenever I try to read it using pandas with

df = pd.read_csv('dbfs:/FileStore/tables/qmwxhxvi1505337108590/PastHires.csv')

I always get this error:

File dbfs:/FileStore/tables/qmwxhxvi1505337108590/PastHires.csv does not exist

How can I get around it?

Kind regards

Re: pandas.read_csv

it_live — Sun, 24 Sep 2017 16:19:18 GMT

Hi, I also struggled to get pandas to read from a CSV. Use the code below with your own path, replacing dbfs: with /dbfs and removing header=True, to make it work in a Databricks Python notebook. You will end up with:

pandas_df = pd.read_csv("/dbfs/FileStore/tables/2esy8tnj1455052720017/part_001-86465.tsv")

For reference, see the Databricks docs: https://docs.databricks.com/user-guide/importing-data.html

Original statement, which did not work:

pandas_df = pd.read_csv("/dbfs/FileStore/tables/2esy8tnj1455052720017/part_001-86465.tsv", header=True)
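For anyone landing here later, the path rewrite described above can be sketched roughly as follows. This is only a minimal sketch: the helper name to_local_dbfs_path is made up for illustration, and it assumes the /dbfs mount is available on the driver node where the notebook runs. As far as I know, pandas rejects a boolean for header (it expects a row number or None), which would explain why header=True had to go.

```python
import os


def to_local_dbfs_path(path):
    """Map a 'dbfs:/...' URI to the '/dbfs/...' local mount path.

    Hypothetical helper for illustration; paths without the
    'dbfs:/' prefix are returned unchanged.
    """
    prefix = "dbfs:/"
    if path.startswith(prefix):
        return "/dbfs/" + path[len(prefix):]
    return path


# Path taken from the original question in this thread.
csv_path = to_local_dbfs_path(
    "dbfs:/FileStore/tables/qmwxhxvi1505337108590/PastHires.csv"
)
print(csv_path)

# Only attempt the read where the mount actually exists
# (assumed to be the case inside a Databricks notebook).
if os.path.exists(csv_path):
    import pandas as pd

    # The first row is used as the header by default (header=0),
    # so no header argument is needed here.
    df = pd.read_csv(csv_path)
    print(df.head())
```

Outside Databricks the final block is simply skipped, since the /dbfs path will not exist there.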

Good Luck IT

Re: pandas.read_csv

MarcoMistroni — Sun, 24 Sep 2017 20:27:47 GMT

Hello

Thanks, that helped.

Also, for some unknown reason my notebook didn't display any output at all, and I thought there was something wrong with the code.

Now I can see my original dataframe. Many thanks.

Re: pandas.read_csv

rohitshah — Mon, 07 Sep 2020 10:57:12 GMT

I am having the same issue. I uploaded a file to DBFS, and the default code it generates does not work either.

Has anyone solved this issue?

Re: pandas.read_csv

cgnarendiran — Mon, 07 Sep 2020 15:28:29 GMT

I'm facing the same issue. However, there is a workaround posted here: https://forums.databricks.com/questions/18254/unable-to-read-file-using-pandas.html

Basically, read the CSV using Spark and then convert it to pandas.
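A rough sketch of that workaround, in case the link goes stale. The function name read_csv_via_spark is hypothetical, and the sketch assumes the spark session object that Databricks notebooks predefine, plus a CSV with a header row. Note that toPandas() pulls the whole dataset onto the driver, so this only suits files that fit in driver memory.

```python
def read_csv_via_spark(spark, path):
    """Read a CSV with Spark, then convert the result to a pandas DataFrame.

    Hypothetical helper for illustration. `spark` is expected to be a
    SparkSession, such as the one predefined in a Databricks notebook.
    """
    spark_df = spark.read.csv(path, header=True, inferSchema=True)
    # toPandas() collects every row to the driver node.
    return spark_df.toPandas()


# Usage inside a notebook (assumes `spark` exists there):
# pandas_df = read_csv_via_spark(
#     spark, "dbfs:/FileStore/tables/qmwxhxvi1505337108590/PastHires.csv"
# )
```

Spark understands the dbfs:/ form of the path directly, so no rewrite to /dbfs should be needed with this approach.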