Data Engineering

I can list the file using dbutils but am not able to read it in Databricks

MohammadWasi
New Contributor II

I can list the file using dbutils, but I am not able to read it in Databricks. Please see the screenshot below. I can see the file using dbutils.fs.ls, but when I try to read it using read_excel, I get the error "FileNotFoundError: [Errno 2] No such file or directory: '/FileStore/xxx/abcd.xls'". I read the Excel file as follows:

"df_mem = pd.read_excel('/FileStore/xxx/abcd.xls', engine='openpyxl', sheet_name = 'Sheet 1')"

MohammadWasi_0-1715064354700.png

 

 

6 REPLIES

Kaniz_Fatma
Community Manager

Hi @MohammadWasi, it seems you’re encountering a common issue related to file paths when working with pd.read_excel in Python.

Let’s troubleshoot this step by step:

  1. Check the File Path:

    • First, ensure that the Excel file (abcd.xls) is indeed located in the specified directory (/FileStore/xxx/).
    • Double-check the spelling of the file name and the path to make sure there are no typos.
    • If the file is in a different directory, adjust the path accordingly.
  2. Use Absolute Paths:

    • Instead of using a relative path, consider using an absolute path to the file. This ensures that you’re accessing the correct directory.
    • You can construct an absolute path using the os module in Python. Here’s an example:
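      import os
      import pandas as pd

      # Directory containing the current script. Note: __file__ is only defined
      # when this runs as a .py file, not inside a notebook cell.
      pre = os.path.dirname(os.path.realpath(__file__))
      fname = 'abcd.xls'
      path = os.path.join(pre, fname)
      df_mem = pd.read_excel(path, engine='openpyxl', sheet_name='Sheet 1')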
      
  3. Convert to CSV (Optional):

    • If you continue to face issues, consider converting your Excel file (abcd.xls) to a CSV file.
    • Then, read the CSV file using pd.read_csv:
      df_mem = pd.read_csv('path/to/abcd.csv')
      

Remember to replace 'path/to/abcd.csv' with the actual path to your CSV file if you choose this option. Hopefully, one of these steps will resolve the issue! 😊

If you need further assistance, feel free to ask.
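One thing worth checking on the path question: dbutils.fs.ls takes DBFS paths, while pandas uses the driver's local file system. On clusters that expose the /dbfs mount, a DBFS path such as /FileStore/xxx/abcd.xls is usually reachable by local file APIs as /dbfs/FileStore/xxx/abcd.xls. The sketch below only shows that mapping. The helper name to_local_path is hypothetical (it is not a Databricks or pandas function), and it assumes the /dbfs mount is available on your cluster.

```python
def to_local_path(dbfs_path: str) -> str:
    """Map a DBFS-style path to its /dbfs local-mount equivalent.

    Hypothetical helper for illustration; assumes the cluster exposes
    DBFS on the driver's local file system under /dbfs.
    """
    if dbfs_path.startswith("/dbfs/"):
        return dbfs_path  # already a local-mount path
    if dbfs_path.startswith("dbfs:"):
        dbfs_path = dbfs_path[len("dbfs:"):]  # drop the URI scheme
    return "/dbfs/" + dbfs_path.lstrip("/")


local_path = to_local_path("/FileStore/xxx/abcd.xls")
print(local_path)

# On a Databricks cluster you could then try (not runnable elsewhere):
# import pandas as pd
# df_mem = pd.read_excel(local_path, sheet_name="Sheet 1")
```

If the read still fails with the /dbfs prefix, the mount may not be enabled for that cluster type, and copying the file to local disk first is another option to look into.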

 

Thanks @Kaniz_Fatma for your reply. Unfortunately, I passed the path as a variable in my code, but it shows the same error again. Please see the screenshot below.

MohammadWasi_0-1715076279718.png

 

Hi @MohammadWasi

  • Ensure that the file extension in your code matches the actual file extension. Double-check if it’s ‘.xls’ or ‘.xlsx’.
  • I see that you are using an incorrect file extension. Please change it and re-run your code.
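On the extension point, the engine passed to pd.read_excel has to match the file format as well: openpyxl reads .xlsx workbooks, while legacy .xls files need the xlrd engine. Here is a minimal sketch of choosing the engine from the extension. The function pick_excel_engine and the mapping are illustrative assumptions, not part of pandas, and renaming a file does not convert its format.

```python
import os

# Illustrative subset of extension-to-engine pairs that pandas supports.
ENGINE_BY_EXTENSION = {
    ".xls": "xlrd",       # legacy binary workbook
    ".xlsx": "openpyxl",  # Office Open XML workbook
    ".xlsm": "openpyxl",  # macro-enabled Office Open XML workbook
}


def pick_excel_engine(path: str) -> str:
    """Return a pandas engine name for the file's extension (hypothetical helper)."""
    ext = os.path.splitext(path)[1].lower()
    if ext not in ENGINE_BY_EXTENSION:
        raise ValueError(f"Unsupported Excel extension: {ext!r}")
    return ENGINE_BY_EXTENSION[ext]


print(pick_excel_engine("/FileStore/xxx/abcd.xls"))
print(pick_excel_engine("/FileStore/xxx/abcd.xlsx"))

# Usage with pandas (requires the matching engine package to be installed):
# import pandas as pd
# df_mem = pd.read_excel(path, engine=pick_excel_engine(path), sheet_name="Sheet 1")
```

An engine mismatch normally shows up as a format or parsing error when the file is opened, so a FileNotFoundError more likely points at the path than at the extension.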

MohammadWasi
New Contributor II

I am using '.xls' format only. 

Hi @MohammadWasi, Please use ".xlsx".

MohammadWasi
New Contributor II

Hi @Kaniz_Fatma, I converted the file to .xlsx format but got the same error as above.
