Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Problems with pandas.read_parquet() and path

johnb1
Contributor

I am doing the "Data Engineering with Databricks V2" learning path.

I cannot run "DE 4.2 - Providing Options for External Sources", as the first code cell does not run successfully:

%run ../Includes/Classroom-Setup-04.2

Screenshot 1: (image attachment)

Inside the setup notebook, the code crashes at the following command (see screenshot 2):

df = pd.read_parquet(path = datasource_path.replace("dbfs:/", '/dbfs/'))

The error message is:

FileNotFoundError: [Errno 2] No such file or directory: '/dbfs/mnt/dbacademy-datasets/data-engineering-with-databricks/v02/ecommerce/raw/users-historical'

Screenshot 2: (image attachment)

There seems to be an issue with the path, even though it actually exists:

Screenshot 3: (image attachment)

I played around a little with the path specification, but nothing helped:

Screenshot 4: (image attachment)
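
For reference, pandas.read_parquet() goes through the driver's local file system, so it needs the /dbfs/ FUSE form of the path, whereas Spark and dbutils use the dbfs:/ URI form. A minimal sketch for checking both forms (path taken from the error message above; whether the /dbfs/ FUSE mount is exposed depends on the cluster, and it may not be available on Community Edition clusters):

import os

# dbfs:/ URI form of the dataset path, as Spark and dbutils see it
uri_path = "dbfs:/mnt/dbacademy-datasets/data-engineering-with-databricks/v02/ecommerce/raw/users-historical"

# /dbfs/ FUSE form, which local-file APIs such as pandas.read_parquet() expect
fuse_path = uri_path.replace("dbfs:/", "/dbfs/")

# Listing through dbutils works whenever the mount itself is reachable
display(dbutils.fs.ls(uri_path))

# This may print False on clusters that do not expose the /dbfs FUSE mount,
# which would explain the FileNotFoundError above even though the data exists
print(os.path.exists(fuse_path))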

16 REPLIES

UmaMahesh1
Honored Contributor III

Hi @John B​ 

Can you please try removing the dbfs: prefix and starting the path with /mnt only?

Also, if this does not work, can you please upload that notebook's DBC archive so that I can check the details?

Cheers..

Uma Mahesh D

Hi @Uma Maheswara Rao Desula​ 

Removing the dbfs: prefix and starting with /mnt only does not help.


Br.

UmaMahesh1
Honored Contributor III

Also @John B​ 

Assuming this is an old training course, check the same using a community cluster with a DBR version lower than 7. Some old training courses' mount points are disabled on DBR 7+.

Cheers...

Uma Mahesh D

UmaMahesh1
Honored Contributor III

@John B​ 

Did your issue get resolved?

If it was not resolved through the above methods, please share the fix you used.

Cheers..

Uma Mahesh D

@Uma Maheswara Rao Desula​ I solved the issue using SS2's suggestion (see below). After reading the data into a Spark DataFrame, I converted it into a pandas DataFrame using the toPandas() method.
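
A minimal sketch of that approach, assuming the dataset path from the original post and a Databricks notebook where spark is already defined:

# Read the parquet files with Spark, using the dbfs:/ URI form of the path
datasource_path = "dbfs:/mnt/dbacademy-datasets/data-engineering-with-databricks/v02/ecommerce/raw/users-historical"
spark_df = spark.read.parquet(datasource_path)

# Convert to a pandas DataFrame; this collects all rows to the driver,
# so it is only suitable for data that fits in driver memory
pandas_df = spark_df.toPandas()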

Hi!

I can only use Runtime 7.3, 9.1, ..., 12.0. The minimum is 7.3. I am using Databricks Community Edition.

Br.

SS2
Valued Contributor

Can you try it like this: spark.read.parquet("dbfs:/mnt/.......")

johnb1
Contributor

Hi @S S​ 

Reading in the file was successful. However, I got a pyspark.sql.dataframe.DataFrame object. This is not the same as a pandas DataFrame, right?

Br.

Aviral-Bhardwaj
Esteemed Contributor III

Hey @S S​  ,

I can understand your issue.

To solve this, import that DBC file. Next to the question notebooks there is a folder with all the solutions, so explore solution one and it will work.

Please upvote if you got some hint from my answer

Thanks

Aviral Bhardwaj


smkazim
New Contributor II

Hello All,

I am getting the exact same issue as mentioned in the first post here. I have tried all the solutions listed:

  1. Changing DBR to 7.3: This gave other errors about libraries not present in that DBR version.
  2. Using spark.read.parquet: This gives an "AnalysisException: Unable to infer schema for Parquet. It must be specified manually." error. I have checked that the parquet files exist in that location and they are not empty.
  3. Exploring the solutions folder: It gives the same errors.

Any ideas what else I can try, please?

Thanks.
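
One way to narrow down the AnalysisException from point 2 is to check that the directory actually contains non-empty parquet part files, since Spark also reports "Unable to infer schema for Parquet" when a path resolves to an empty or non-parquet directory. A minimal sketch, assuming the dataset path from the original post:

# List what is actually stored under the dataset path
files = dbutils.fs.ls("dbfs:/mnt/dbacademy-datasets/data-engineering-with-databricks/v02/ecommerce/raw/users-historical")
for f in files:
    # FileInfo exposes name, path and size; missing or zero-byte part files
    # would point to an incomplete dataset copy rather than a path problem
    print(f.name, f.size)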

vijaykumar99535
New Contributor III

I used spark.read.parquet and then converted that to a pandas DataFrame, and it worked for me.

Upvote if it helped you.


Worked for me too! Thanks

Dibs
New Contributor III

Thanks, it helped.

jonathanchcc
New Contributor III

Thanks for sharing, this helped me too 🤖
