<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic 'No file or Directory' error when using pandas.read_excel in Databricks in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/no-file-or-directory-error-when-using-pandas-read-excel-in/m-p/38391#M26618</link>
    <description>&lt;DIV class=""&gt;&lt;DIV class=""&gt;&lt;P&gt;I am baffled by the behaviour of Databricks:&lt;/P&gt;&lt;P&gt;Below you can see the contents of the directory, listed using dbutils in Databricks. The listing clearly shows the `test.xlsx` file in the directory (and I can even open it using `dbutils.fs.head`), but when I use `pandas.read_excel` to read it, I get the error below stating that it can't be found...&lt;/P&gt;&lt;P&gt;I am running the commands on a Unity Catalog shared cluster with Databricks Runtime 13.2 (Spark 3.4.0).&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="wCLqf" style="width: 999px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/2995i7D5069E2F873902D/image-size/large/is-moderation-mode/true?v=v2&amp;amp;px=999" role="button" title="wCLqf" alt="wCLqf" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;/DIV&gt;&lt;/DIV&gt;</description>
    <pubDate>Tue, 25 Jul 2023 14:23:02 GMT</pubDate>
    <dc:creator>anirudh_a</dc:creator>
    <dc:date>2023-07-25T14:23:02Z</dc:date>
    <item>
      <title>'No file or Directory' error when using pandas.read_excel in Databricks</title>
      <link>https://community.databricks.com/t5/data-engineering/no-file-or-directory-error-when-using-pandas-read-excel-in/m-p/38391#M26618</link>
      <description>&lt;DIV class=""&gt;&lt;DIV class=""&gt;&lt;P&gt;I am baffled by the behaviour of Databricks:&lt;/P&gt;&lt;P&gt;Below you can see the contents of the directory, listed using dbutils in Databricks. The listing clearly shows the `test.xlsx` file in the directory (and I can even open it using `dbutils.fs.head`), but when I use `pandas.read_excel` to read it, I get the error below stating that it can't be found...&lt;/P&gt;&lt;P&gt;I am running the commands on a Unity Catalog shared cluster with Databricks Runtime 13.2 (Spark 3.4.0).&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="wCLqf" style="width: 999px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/2995i7D5069E2F873902D/image-size/large/is-moderation-mode/true?v=v2&amp;amp;px=999" role="button" title="wCLqf" alt="wCLqf" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;/DIV&gt;&lt;/DIV&gt;</description>
      <pubDate>Tue, 25 Jul 2023 14:23:02 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/no-file-or-directory-error-when-using-pandas-read-excel-in/m-p/38391#M26618</guid>
      <dc:creator>anirudh_a</dc:creator>
      <dc:date>2023-07-25T14:23:02Z</dc:date>
    </item>
    <item>
      <title>Re: 'No file or Directory' error when using pandas.read_excel in Databricks</title>
      <link>https://community.databricks.com/t5/data-engineering/no-file-or-directory-error-when-using-pandas-read-excel-in/m-p/38397#M26622</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;Could you please check whether the local file API limitations apply here: &lt;A href="https://docs.databricks.com/files/index.html#local-file-api-limitations" target="_blank"&gt;https://docs.databricks.com/files/index.html#local-file-api-limitations&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Also, could you please try writing the path with dbfs:// at the beginning?&lt;/P&gt;&lt;P&gt;Please tag &lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/26078"&gt;@Debayan&lt;/a&gt; in your next comment so that I am notified. Thanks!&lt;/P&gt;</description>
      <pubDate>Tue, 25 Jul 2023 15:22:38 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/no-file-or-directory-error-when-using-pandas-read-excel-in/m-p/38397#M26622</guid>
      <dc:creator>Debayan</dc:creator>
      <dc:date>2023-07-25T15:22:38Z</dc:date>
    </item>
    <item>
      <title>Re: 'No file or Directory' error when using pandas.read_excel in Databricks</title>
      <link>https://community.databricks.com/t5/data-engineering/no-file-or-directory-error-when-using-pandas-read-excel-in/m-p/38400#M26624</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/26078"&gt;@Debayan&lt;/a&gt;&amp;nbsp;&lt;BR /&gt;I get the following message:&lt;BR /&gt;&lt;BR /&gt;&lt;STRONG&gt;Use "/dbfs", not "dbfs:"&lt;/STRONG&gt;: The function expects a local file path. The error is caused by passing a path prefixed with "dbfs:".&lt;/P&gt;&lt;P&gt;Looking at &lt;A href="https://learn.microsoft.com/en-us/azure/databricks/dbfs/unity-catalog" target="_blank" rel="noopener"&gt;https://learn.microsoft.com/en-us/azure/databricks/dbfs/unity-catalog&lt;/A&gt;, I read the following excerpt and I am wondering if I am missing something. As I said, I am using Unity Catalog on a shared cluster:&lt;/P&gt;&lt;DIV class=""&gt;&lt;H2&gt;How does DBFS work in shared access mode?&lt;/H2&gt;&lt;/DIV&gt;&lt;P&gt;Shared access mode combines Unity Catalog data governance with Azure Databricks legacy table ACLs. Access to data in the hive_metastore is only available to users that have permissions explicitly granted.&lt;/P&gt;&lt;P&gt;To interact with files directly using DBFS, you must have ANY FILE permissions granted. Because ANY FILE allows users to bypass legacy tables ACLs in the hive_metastore and access all data managed by DBFS, Databricks recommends caution when granting this privilege.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Shared access mode does not support DBFS root or mounts.&lt;/STRONG&gt;&lt;/P&gt;</description>
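The error message quoted in this reply distinguishes the URI form of a DBFS path (dbfs:/...) from the local mount form (/dbfs/...) that plain Python libraries such as pandas expect. The following is only a minimal sketch of converting one form to the other. The helper name to_local_dbfs_path is hypothetical, and, going by the excerpt quoted in the same reply, the /dbfs mount may not be available at all on a shared access mode cluster, so the converted path can still fail there.

```python
def to_local_dbfs_path(path: str) -> str:
    """Rewrite a 'dbfs:/...' URI as a '/dbfs/...' local path.

    Hypothetical helper for illustration; paths without the 'dbfs:'
    prefix are returned unchanged.
    """
    prefix = "dbfs:"
    if path.startswith(prefix):
        # Drop the scheme and any leading slashes, then re-root under /dbfs.
        rest = path[len(prefix):].lstrip("/")
        return "/dbfs/" + rest
    return path
```

On clusters where the /dbfs mount exists, the result can be passed directly to pandas.read_excel.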
      <pubDate>Tue, 25 Jul 2023 16:07:12 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/no-file-or-directory-error-when-using-pandas-read-excel-in/m-p/38400#M26624</guid>
      <dc:creator>anirudh_a</dc:creator>
      <dc:date>2023-07-25T16:07:12Z</dc:date>
    </item>
    <item>
      <title>Re: 'No file or Directory' error when using pandas.read_excel in Databricks</title>
      <link>https://community.databricks.com/t5/data-engineering/no-file-or-directory-error-when-using-pandas-read-excel-in/m-p/38510#M26655</link>
      <description>&lt;P&gt;Hi, an admin must grant SELECT permission on files so the selected user can create a table.&lt;/P&gt;&lt;P&gt;You can also refer to &lt;A href="https://kb.databricks.com/en_US/data/user-does-not-have-permission-select-on-any-file" target="_blank"&gt;https://kb.databricks.com/en_US/data/user-does-not-have-permission-select-on-any-file&lt;/A&gt;.&lt;/P&gt;</description>
      <pubDate>Wed, 26 Jul 2023 15:25:59 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/no-file-or-directory-error-when-using-pandas-read-excel-in/m-p/38510#M26655</guid>
      <dc:creator>Debayan</dc:creator>
      <dc:date>2023-07-26T15:25:59Z</dc:date>
    </item>
    <item>
      <title>Re: 'No file or Directory' error when using pandas.read_excel in Databricks</title>
      <link>https://community.databricks.com/t5/data-engineering/no-file-or-directory-error-when-using-pandas-read-excel-in/m-p/38698#M26723</link>
      <description>&lt;P&gt;This did not do anything. As far as I know, SELECT permissions only pertain to tables in catalogs. I still can't read files from DBFS.&lt;/P&gt;</description>
      <pubDate>Fri, 28 Jul 2023 20:20:16 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/no-file-or-directory-error-when-using-pandas-read-excel-in/m-p/38698#M26723</guid>
      <dc:creator>anirudh_a</dc:creator>
      <dc:date>2023-07-28T20:20:16Z</dc:date>
    </item>
    <item>
      <title>Re: 'No file or Directory' error when using pandas.read_excel in Databricks</title>
      <link>https://community.databricks.com/t5/data-engineering/no-file-or-directory-error-when-using-pandas-read-excel-in/m-p/38741#M26740</link>
      <description>&lt;P&gt;Hi, could you please raise a support case so that we can investigate and triage this?&lt;/P&gt;</description>
      <pubDate>Mon, 31 Jul 2023 05:16:16 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/no-file-or-directory-error-when-using-pandas-read-excel-in/m-p/38741#M26740</guid>
      <dc:creator>Debayan</dc:creator>
      <dc:date>2023-07-31T05:16:16Z</dc:date>
    </item>
    <item>
      <title>Re: 'No file or Directory' error when using pandas.read_excel in Databricks</title>
      <link>https://community.databricks.com/t5/data-engineering/no-file-or-directory-error-when-using-pandas-read-excel-in/m-p/39632#M27042</link>
      <description>&lt;P&gt;This is a really frustrating design choice - on a Unity Catalog shared access mode (SAM) cluster, Databricks disabled the filesystem mount for DBFS that allows it to be read through vanilla Python, but left it in place for PySpark, because their implementation supports access control through Spark but not through Python.&lt;/P&gt;&lt;P&gt;As a workaround for reading data specifically, you can read the file with PySpark first and then convert it to a vanilla pandas object, but this does not work for all workflows and is very poorly documented.&lt;/P&gt;</description>
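One way the workaround in this reply might look for an Excel file is sketched below. It is not the poster's code: it swaps in Spark's built-in binaryFile data source to fetch the raw bytes through Spark, then hands them to pandas.read_excel on the driver. The function name read_excel_via_spark is hypothetical, the spark argument is assumed to be the notebook's SparkSession, and whether binaryFile reads are permitted on a given shared cluster is an assumption that needs checking. Parsing .xlsx files with pandas also assumes that an engine such as openpyxl is installed.

```python
import io


def read_excel_via_spark(spark, path, **read_excel_kwargs):
    """Load one Excel file through Spark, then parse it with pandas on the driver."""
    # binaryFile yields one row per file, with the raw bytes in the "content" column.
    row = spark.read.format("binaryFile").load(path).select("content").head()
    if row is None:
        # Defensive check: no file was returned for this path.
        raise FileNotFoundError(path)

    # Imported lazily so the helper can be defined without pandas installed.
    import pandas as pd

    return pd.read_excel(io.BytesIO(bytes(row["content"])), **read_excel_kwargs)
```

In a notebook this would be called along the lines of read_excel_via_spark(spark, "dbfs:/path/to/test.xlsx"), where the path is a placeholder. The whole file is pulled to the driver, so the sketch only suits files that fit comfortably in driver memory.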
      <pubDate>Fri, 11 Aug 2023 12:26:27 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/no-file-or-directory-error-when-using-pandas-read-excel-in/m-p/39632#M27042</guid>
      <dc:creator>JameDavi_51481</dc:creator>
      <dc:date>2023-08-11T12:26:27Z</dc:date>
    </item>
    <item>
      <title>Re: 'No file or Directory' error when using pandas.read_excel in Databricks</title>
      <link>https://community.databricks.com/t5/data-engineering/no-file-or-directory-error-when-using-pandas-read-excel-in/m-p/40023#M27110</link>
      <description>&lt;P&gt;We used to have this problem, but worked around it by keeping the files in a UC external location and using Spark pandas instead of regular pandas, since it uses the Spark access model, which works with UC grants.&lt;/P&gt;&lt;P&gt;However, with the recent addition of UC Volumes, you can add the location as a &lt;A href="https://docs.databricks.com/en/sql/language-manual/sql-ref-volumes.html#:~:text=You%20can%20use%20volumes%20to,can%20be%20managed%20or%20external." target="_self"&gt;volume&lt;/A&gt;&amp;nbsp;and access it as you would a file in a regular file system.&lt;/P&gt;</description>
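As a rough illustration of the Volumes approach in this reply: files in a Unity Catalog volume are addressed under /Volumes/ followed by the catalog, schema and volume names and then the file path, and that path can be given to regular pandas. The sketch below makes several assumptions: both helper names are hypothetical, the catalog, schema and volume names in the usage line are placeholders, and support for this on a shared cluster depends on the Databricks Runtime version in use.

```python
import posixpath


def volume_path(catalog, schema, volume, *parts):
    """Build a /Volumes/... path for a file stored in a Unity Catalog volume."""
    return posixpath.join("/Volumes", catalog, schema, volume, *parts)


def read_excel_from_volume(catalog, schema, volume, *parts, **read_excel_kwargs):
    """Read an Excel file from a volume with regular pandas (hypothetical helper)."""
    # Imported lazily so volume_path can be used without pandas installed.
    import pandas as pd

    return pd.read_excel(volume_path(catalog, schema, volume, *parts), **read_excel_kwargs)
```

For example, read_excel_from_volume("main", "default", "raw", "test.xlsx") would read /Volumes/main/default/raw/test.xlsx, assuming such a volume exists and the user has been granted read access to it.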
      <pubDate>Wed, 16 Aug 2023 06:59:49 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/no-file-or-directory-error-when-using-pandas-read-excel-in/m-p/40023#M27110</guid>
      <dc:creator>knutasm</dc:creator>
      <dc:date>2023-08-16T06:59:49Z</dc:date>
    </item>
    <item>
      <title>Re: 'No file or Directory' error when using pandas.read_excel in Databricks</title>
      <link>https://community.databricks.com/t5/data-engineering/no-file-or-directory-error-when-using-pandas-read-excel-in/m-p/50351#M28779</link>
      <description>&lt;P&gt;Hey, I encountered this recently. I can see you are using a shared cluster; try switching to a single user cluster and it will fix it.&lt;/P&gt;&lt;P&gt;Can someone let me know why it wasn't working with a shared cluster?&lt;/P&gt;&lt;P&gt;Thanks.&lt;/P&gt;</description>
      <pubDate>Thu, 02 Nov 2023 11:33:09 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/no-file-or-directory-error-when-using-pandas-read-excel-in/m-p/50351#M28779</guid>
      <dc:creator>DamnKush</dc:creator>
      <dc:date>2023-11-02T11:33:09Z</dc:date>
    </item>
  </channel>
</rss>

