Problems with pandas.read_parquet() and path

johnb1
Contributor

I am doing the "Data Engineering with Databricks V2" learning path.

I cannot run "DE 4.2 - Providing Options for External Sources", because the first code cell does not run successfully:

%run ../Includes/Classroom-Setup-04.2

Screenshot 1:

MicrosoftTeams-image 

Inside the setup notebook, the code crashes at the following command (see screenshot 2):

df = pd.read_parquet(path = datasource_path.replace("dbfs:/", '/dbfs/'))

The error message is:

FileNotFoundError: [Errno 2] No such file or directory: '/dbfs/mnt/dbacademy-datasets/data-engineering-with-databricks/v02/ecommerce/raw/users-historical'

Screenshot 2:

MicrosoftTeams-image (1) 

The path seems to be the problem, even though it actually exists:

Screenshot 3:

Capture 
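
To narrow this down, here is a minimal sketch that checks whether the local `/dbfs/` form of the path is visible to plain Python file APIs, since `pd.read_parquet` reads that path through the local filesystem. The helper names `to_fuse_path` and `describe_path` are hypothetical. The value of `datasource_path` is an assumption, inferred from the error message and the `replace` call in the setup notebook.

```python
import os


def to_fuse_path(dbfs_uri: str) -> str:
    """Translate a 'dbfs:/...' URI into the '/dbfs/...' local mount form.

    Hypothetical helper; it mirrors the replace("dbfs:/", "/dbfs/") call
    in the setup notebook.
    """
    prefix = "dbfs:/"
    if dbfs_uri.startswith(prefix):
        return "/dbfs/" + dbfs_uri[len(prefix):].lstrip("/")
    return dbfs_uri


def describe_path(local_path: str) -> str:
    """Report what the local filesystem sees at local_path."""
    if os.path.isdir(local_path):
        return f"directory with {len(os.listdir(local_path))} entries"
    if os.path.isfile(local_path):
        return "file"
    return "missing"


# Assumed value, reconstructed from the path in the FileNotFoundError.
datasource_path = (
    "dbfs:/mnt/dbacademy-datasets/data-engineering-with-databricks"
    "/v02/ecommerce/raw/users-historical"
)

local_path = to_fuse_path(datasource_path)
print(local_path, "->", describe_path(local_path))
```

If this reports "missing" while the `dbfs:/` form of the path can still be listed, the path string itself is probably fine and the local `/dbfs/` view is what cannot see the mount.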

I tried a few variations of the path specification, but none of them worked:

Screenshot 4:

Capture_2