Databricks Community

johnb1 · ‎11-30-2022

I am doing the "Data Engineering with Databricks V2" learning path.

I cannot run "DE 4.2 - Providing Options for External Sources", as the first code cell does not run successfully:

%run ../Includes/Classroom-Setup-04.2

Screenshot 1:

Inside the setup notebook, the code crashes at the following command (see screenshot 2):

df = pd.read_parquet(path = datasource_path.replace("dbfs:/", '/dbfs/'))

The error message is:

FileNotFoundError: [Errno 2] No such file or directory: '/dbfs/mnt/dbacademy-datasets/data-engineering-with-databricks/v02/ecommerce/raw/users-historical'
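For anyone debugging the same error: one way to see what the driver can actually reach is to apply the same dbfs:/ to /dbfs/ translation the setup notebook uses and then check the result on the local filesystem. This is only a diagnostic sketch. The helper names `to_local_dbfs_path` and `describe_local_path` are made up for illustration, and it assumes the failure might come from the local /dbfs mount not being visible on the cluster rather than from the data being absent.

```python
import os


def to_local_dbfs_path(uri: str) -> str:
    """Map a 'dbfs:/...' URI to the '/dbfs/...' local path (hypothetical helper)."""
    prefix = "dbfs:/"
    if uri.startswith(prefix):
        return "/dbfs/" + uri[len(prefix):]
    return uri  # already a local-style path, leave it unchanged


def describe_local_path(path: str) -> str:
    """Report how the local filesystem sees the path (hypothetical helper)."""
    if os.path.isdir(path):
        return "directory"
    if os.path.isfile(path):
        return "file"
    return "missing"


# Path taken from the error message in the post above.
datasource_path = (
    "dbfs:/mnt/dbacademy-datasets/data-engineering-with-databricks/"
    "v02/ecommerce/raw/users-historical"
)
local_path = to_local_dbfs_path(datasource_path)
print(local_path, "->", describe_local_path(local_path))
```

If this prints "missing" while the same dbfs:/ location can be listed through Spark or the notebook file browser, the problem is probably with local file access rather than with the dataset itself.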

Screenshot 2:

There seems to be an issue with the path, even though it actually exists:

Screenshot 3:

I played around a little with the path specification, but nothing helped:

Screenshot 4:

UmaMahesh1 · ‎11-30-2022

Hi @John B

Can you please try removing the dbfs prefix and starting the path with /mnt only?

Also, if this does not work, can you please upload that notebook's DBC archive so that I can check the details?

Cheers..

Uma Mahesh D

johnb1 · ‎12-16-2022

Hi @Uma Maheswara Rao Desula

Removing the dbfs and starting with /mnt only does not help.

Br.

UmaMahesh1 · ‎11-30-2022

Also @John B

Assuming this is an old training course, try the same thing on a Community Edition cluster with a DBR version lower than 7. The mount points of some old training courses are disabled in DBR 7+.

Cheers...

Uma Mahesh D

UmaMahesh1 · ‎12-03-2022

@John B

Did your issue get resolved?

If you resolved it some other way than the methods above, please post the fix you used.

Cheers..

Uma Mahesh D

johnb1 · ‎12-16-2022

@Uma Maheswara Rao Desula I solved the issue using SS2's suggestion (see below). After reading the data into a Spark DataFrame, I converted it into a pandas DataFrame using the toPandas() method.
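The workaround described here can be sketched roughly as follows. The function name `read_parquet_as_pandas` is hypothetical, and the sketch assumes it is called from a Databricks notebook, where a `spark` session is already defined.

```python
def read_parquet_as_pandas(spark, path):
    """Read Parquet through Spark, then convert the result to pandas.

    `spark` is an existing SparkSession; `path` may be a dbfs:/ URI,
    which Spark resolves itself, so no /dbfs/ rewriting is needed.
    """
    spark_df = spark.read.parquet(path)  # returns a pyspark.sql.DataFrame
    return spark_df.toPandas()           # collects all rows to the driver as a pandas.DataFrame


# Usage inside a notebook (not executed here, since `spark` and
# `datasource_path` only exist in the notebook session):
#   df = read_parquet_as_pandas(spark, datasource_path)
```

Note that toPandas() pulls the whole dataset into driver memory, so this is only suitable for data that is small enough to fit there.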

johnb1 · ‎12-16-2022

Hi!

I can only use Runtime 7.3, 9.1, ..., 12.0. The minimum is 7.3. I am using Databricks Community Edition.

Br.

SS2 · ‎12-03-2022

Can you try it like this: spark.read.parquet("dbfs:/mnt/.......")

johnb1 · ‎12-16-2022

Hi @S S

Reading in the file was successful. However, I got a pyspark.sql.dataframe.DataFrame object. This is not the same as a pandas DataFrame, right?

Br.

Aviral-Bhardwaj · ‎12-16-2022

Hey @S S ,

I can understand your issue.

To solve this, import that DBC file. Besides the question notebooks, it contains a folder with all the solutions, so explore the solution for this one and it will work.

Please upvote if you got some hint from my answer

Thanks

Aviral Bhardwaj


smkazim · ‎03-29-2023

Hello All,

I am getting the exact issue mentioned in the first post here. I have tried all the solutions listed:

Changing DBR to 7.3: Gave other errors related to libraries not present in that DBR version
Using spark.read.parquet: This gives the error "AnalysisException: Unable to infer schema for Parquet. It must be specified manually." I have checked that the Parquet files exist in that location and that they are not empty.
Exploring solutions folder: It is giving the same errors.

Any ideas what else I can try, please?

Thanks.
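One thing that may be worth ruling out for the "Unable to infer schema" error is that Spark sees no usable data files directly under the path it is given, for example because the path points one level too high or the directory holds only marker files. The sketch below is a rough check under that assumption. The helper name `visible_data_files` is hypothetical, and it expects (name, size) pairs that you collect yourself, for instance from a directory listing such as dbutils.fs.ls.

```python
def visible_data_files(entries):
    """Return names of non-empty files that are not hidden.

    `entries` is an iterable of (name, size_in_bytes) pairs.
    Names starting with '_' or '.' are treated as hidden or marker
    files, and names ending with '/' are treated as subdirectories.
    """
    return [
        name
        for name, size in entries
        if size > 0
        and not name.startswith(("_", "."))
        and not name.endswith("/")
    ]


# Illustrative listing (made-up names, not taken from the course dataset).
listing = [
    ("_SUCCESS", 0),
    ("_committed_123", 120),
    ("part-00000.snappy.parquet", 4096),
    ("nested/", 0),
]
print(visible_data_files(listing))
```

An empty result would suggest that the path needs adjusting. A non-empty result means this particular explanation does not apply and the cause lies elsewhere.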

vijaykumar99535 · ‎01-04-2024

I used spark.read.parquet and then converted the result to a pandas DataFrame, and it worked for me.

Upvote if it helped you.

mrb_cookiebaker · ‎06-28-2024

Worked for me too! Thanks

Dibs · ‎08-29-2024

Thanks it helped.

jonathanchcc · ‎02-02-2024

Thanks for sharing, this helped me too 🤖


Problems with pandas.read_parquet() and path
