Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

I can list the file using dbutils but cannot read it in Databricks

MohammadWasi
New Contributor II

I can list the file using dbutils but cannot read it in Databricks; please see the screenshot below. I can see the file with dbutils.fs.ls, but when I try to read it with read_excel, I get the error "FileNotFoundError: [Errno 2] No such file or directory: '/FileStore/xxx/abcd.xls'". I read the Excel file as follows:

"df_mem = pd.read_excel('/FileStore/xxx/abcd.xls', engine='openpyxl', sheet_name = 'Sheet 1')"

MohammadWasi_0-1715064354700.png

 

 

6 REPLIES

Kaniz
Community Manager

Hi @MohammadWasi, it seems you're encountering a common file-path issue when using pd.read_excel in Python.

Let's troubleshoot this step by step:

  1. Check the File Path:

    • First, ensure that the Excel file (abcd.xls) is indeed located in the specified directory (/FileStore/xxx/).
    • Double-check the spelling of the file name and the path to make sure there are no typos.
    • If the file is in a different directory, adjust the path accordingly.
  2. Use Absolute Paths:

    • Instead of using a relative path, consider using an absolute path to the file. This ensures that you're accessing the correct directory.
    • You can construct an absolute path using the os module in Python. Here's an example:
      import os
      import pandas as pd

      # Note: __file__ is set when running as a script; it is typically undefined in a notebook cell.
      pre = os.path.dirname(os.path.realpath(__file__))
      fname = 'abcd.xls'
      path = os.path.join(pre, fname)  # absolute path next to the running script
      df_mem = pd.read_excel(path, engine='openpyxl', sheet_name='Sheet 1')
      
  3. Convert to CSV (Optional):

    • If you continue to face issues, consider converting your Excel file (abcd.xls) to a CSV file.
    • Then, read the CSV file using pd.read_csv:
      df_mem = pd.read_csv('path/to/abcd.csv')
      

Remember to replace 'path/to/abcd.csv' with the actual path to your CSV file if you choose this option. Hopefully, one of these steps will resolve the issue! 😊

If you need further assistance, feel free to ask.
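
As a sketch of the "check the file path" step above: dbutils.fs.ls resolves paths against DBFS, while pandas opens files through the local filesystem, so the same string can point to different places. The example below assumes the cluster exposes DBFS under the local /dbfs mount, which may not hold on every cluster type. The helper names to_local_dbfs_path and first_existing are hypothetical and exist only for illustration.

```python
import os


def to_local_dbfs_path(path: str) -> str:
    """Map a DBFS path to its local-filesystem form under /dbfs (hypothetical helper)."""
    if path.startswith("/dbfs/"):
        return path                    # already a local-style path
    if path.startswith("dbfs:/"):
        path = path[len("dbfs:"):]     # "dbfs:/FileStore/x" -> "/FileStore/x"
    return "/dbfs/" + path.lstrip("/")


def first_existing(candidates):
    """Return the first candidate that exists on the local filesystem, else None."""
    for candidate in candidates:
        if os.path.exists(candidate):
            return candidate
    return None


# Path taken from the question; the /dbfs variant is an assumption about the cluster.
original = "/FileStore/xxx/abcd.xls"
candidates = [original, to_local_dbfs_path(original)]
print("candidates:", candidates)
print("visible to pandas at:", first_existing(candidates))
```

If the second candidate is the one that exists, passing it to pd.read_excel instead of the original string would be the next thing to try.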

 

MohammadWasi
New Contributor II

Thanks @Kaniz for your reply. Unfortunately, I used path as a variable in my code, but it still shows the same error. Please see the screenshot below.

MohammadWasi_0-1715076279718.png

 

Hi @MohammadWasi

  • Ensure that the file extension in your code matches the actual file extension. Double-check whether it's '.xls' or '.xlsx'.
  • I see that you are using an incorrect file extension. Please change it and re-run your code.
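
The extension check above also affects the engine argument: openpyxl reads the newer .xlsx family, while legacy .xls files need xlrd. A minimal sketch of matching the engine to the extension follows, assuming the relevant engine package is installed on the cluster. The helper pick_excel_engine is hypothetical. A mismatch here would normally surface as a format error rather than FileNotFoundError, so this covers a separate problem in the original call.

```python
from pathlib import Path

# Extension -> pandas.read_excel engine name.
ENGINE_BY_EXTENSION = {
    ".xls": "xlrd",        # legacy binary format; openpyxl cannot open it
    ".xlsx": "openpyxl",
    ".xlsm": "openpyxl",
}


def pick_excel_engine(path: str) -> str:
    """Return the read_excel engine matching the file extension (hypothetical helper)."""
    suffix = Path(path).suffix.lower()
    try:
        return ENGINE_BY_EXTENSION[suffix]
    except KeyError:
        raise ValueError(f"Unsupported Excel extension: {suffix!r}") from None


# Usage (not executed here; requires pandas and the chosen engine package):
# import pandas as pd
# path = "/FileStore/xxx/abcd.xls"
# df_mem = pd.read_excel(path, engine=pick_excel_engine(path), sheet_name="Sheet 1")
```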

MohammadWasi
New Contributor II

I am using '.xls' format only. 

Hi @MohammadWasi, Please use ".xlsx".

MohammadWasi
New Contributor II

Hi @Kaniz, I changed the file to .xlsx format but got the same error as above.