<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic File found with %fs ls but not with spark.read in Get Started Discussions</title>
    <link>https://community.databricks.com/t5/get-started-discussions/file-found-with-fs-ls-but-not-with-spark-read/m-p/78417#M9144</link>
    <description>File found with %fs ls but not with spark.read</description>
    <pubDate>Thu, 11 Jul 2024 21:00:22 GMT</pubDate>
    <dc:creator>joseroca99</dc:creator>
    <dc:date>2024-07-11T21:00:22Z</dc:date>
    <item>
      <title>File found with %fs ls but not with spark.read</title>
      <link>https://community.databricks.com/t5/get-started-discussions/file-found-with-fs-ls-but-not-with-spark-read/m-p/78417#M9144</link>
      <description>&lt;DIV&gt;&lt;DIV&gt;Code:&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;wikipediaDF = (spark.read&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp; .option("header", True)&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp; .option("sep", "\t")&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp; .option("inferSchema", True)&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp; .csv("/databricks-datasets/wikipedia-datasets/data-001/pageviews/raw/pageviews_by_second.tsv")&lt;/DIV&gt;&lt;DIV&gt;)&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;display(wikipediaDF)&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;Error:&lt;/DIV&gt;&lt;DIV&gt;Failed to store the result. Try rerunning the command.&lt;/DIV&gt;&lt;DIV&gt;Failed to upload command result to DBFS. Error message: PUT request to create file error HttpResponseProxy{HTTP/1.1 404 The specified filesystem does not exist. 
[Content-Length: 175, Content-Type: application/json;charset=utf-8, Server: Windows-Azure-HDFS/1.0 Microsoft-HTTPAPI/2.0, x-ms-error-code: FilesystemNotFound, x-ms-request-id: 614c7044-901f-004d-1bd4-d3b66f000000, x-ms-version: 2021-04-10, Date: Thu, 11 Jul 2024 20:52:49 GMT] ResponseEntityProxy{[Content-Type: application/json;charset=utf-8,Content-Length: 175,Chunked: false]}}&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;The files are open datasets shared by Databricks. I was always able to open them, but now I can't.&lt;/DIV&gt;&lt;/DIV&gt;</description>
      <pubDate>Thu, 11 Jul 2024 21:00:22 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/file-found-with-fs-ls-but-not-with-spark-read/m-p/78417#M9144</guid>
      <dc:creator>joseroca99</dc:creator>
      <dc:date>2024-07-11T21:00:22Z</dc:date>
    </item>
    <item>
      <title>Re: File found with %fs ls but not with spark.read</title>
      <link>https://community.databricks.com/t5/get-started-discussions/file-found-with-fs-ls-but-not-with-spark-read/m-p/78430#M9145</link>
      <description>&lt;P&gt;Hi &lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/111819"&gt;@joseroca99&lt;/a&gt;,&lt;/P&gt;&lt;P&gt;Try adding the filesystem scheme to your path, something like: dbfs:/databricks-datasets/wikipedia-datasets/data-001/pageviews/raw/pageviews_by_second&lt;/P&gt;</description>
      <pubDate>Fri, 12 Jul 2024 02:34:47 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/file-found-with-fs-ls-but-not-with-spark-read/m-p/78430#M9145</guid>
      <dc:creator>szymon_dybczak</dc:creator>
      <dc:date>2024-07-12T02:34:47Z</dc:date>
    </item>
    <item>
      <title>Re: File found with %fs ls but not with spark.read</title>
      <link>https://community.databricks.com/t5/get-started-discussions/file-found-with-fs-ls-but-not-with-spark-read/m-p/78433#M9146</link>
      <description>&lt;P&gt;Use the filesystem prefix that matches where you found the file with %fs.&lt;BR /&gt;If it's in DBFS, use dbfs:/YOUR_PATH&lt;BR /&gt;If it's in the local file system, try file:/&lt;/P&gt;</description>
      <pubDate>Fri, 12 Jul 2024 04:21:46 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/file-found-with-fs-ls-but-not-with-spark-read/m-p/78433#M9146</guid>
      <dc:creator>p4pratikjain</dc:creator>
      <dc:date>2024-07-12T04:21:46Z</dc:date>
    </item>
    <item>
      <title>Re: File found with %fs ls but not with spark.read</title>
      <link>https://community.databricks.com/t5/get-started-discussions/file-found-with-fs-ls-but-not-with-spark-read/m-p/78544#M9147</link>
      <description>&lt;P&gt;I tried writing dbfs: and /dbfs before the path, but it's still not working.&lt;/P&gt;</description>
      <pubDate>Fri, 12 Jul 2024 14:21:21 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/file-found-with-fs-ls-but-not-with-spark-read/m-p/78544#M9147</guid>
      <dc:creator>joseroca99</dc:creator>
      <dc:date>2024-07-12T14:21:21Z</dc:date>
    </item>
    <item>
      <title>Re: File found with %fs ls but not with spark.read</title>
      <link>https://community.databricks.com/t5/get-started-discussions/file-found-with-fs-ls-but-not-with-spark-read/m-p/78546#M9148</link>
      <description>&lt;P&gt;Update 1: Apparently the problem only shows up when using display(). Using show() or display(df.limit()) works fine. I also started using the premium pricing tier, so I'm going to see what happens if I use the free 14-day trial pricing tier.&lt;/P&gt;&lt;P&gt;Update 2: I tried the dbfs: and /dbfs prefixes, and it's still not working. I also tried reading a table I got from the marketplace with spark.read.table(), and the problem persists.&lt;/P&gt;</description>
      <pubDate>Fri, 12 Jul 2024 14:29:45 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/file-found-with-fs-ls-but-not-with-spark-read/m-p/78546#M9148</guid>
      <dc:creator>joseroca99</dc:creator>
      <dc:date>2024-07-12T14:29:45Z</dc:date>
    </item>
    <item>
      <title>Re: File found with %fs ls but not with spark.read</title>
      <link>https://community.databricks.com/t5/get-started-discussions/file-found-with-fs-ls-but-not-with-spark-read/m-p/78618#M9149</link>
      <description>&lt;P&gt;I think there is a networking or permissions problem with the storage account that Databricks created in the managed resource group. By default, when you run a notebook interactively by clicking&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG&gt;Run&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;in the notebook:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;If the results are small, they are stored in the Azure Databricks&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;A href="https://learn.microsoft.com/en-us/azure/databricks/getting-started/overview" target="_blank" rel="noopener"&gt;control plane&lt;/A&gt;, along with the notebook’s command contents and metadata.&lt;/LI&gt;&lt;LI&gt;Larger results are stored in the &lt;STRONG&gt;workspace storage account&lt;/STRONG&gt; in your Azure subscription. Azure Databricks automatically creates the workspace storage account. Azure Databricks uses this storage area for workspace system data and your workspace’s&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;A href="https://learn.microsoft.com/en-us/azure/databricks/dbfs/dbfs-root" target="_blank" rel="noopener"&gt;DBFS root&lt;/A&gt;. Notebook results are stored in workspace system data storage, which is not accessible by users.&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;So in your case, limiting the result set works because small results are stored in the Azure Databricks control plane.&lt;BR /&gt;But when you try to display the whole DataFrame without limiting it, Databricks tries to save the result in the workspace storage account, which is likely where the 404 FilesystemNotFound error comes from. Look at the cluster logs and see if there are any errors related to the root storage account.&lt;BR /&gt;Maybe you have a firewall that prevents Databricks from connecting to the storage account.&lt;/P&gt;</description>
      <pubDate>Fri, 12 Jul 2024 22:01:26 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/file-found-with-fs-ls-but-not-with-spark-read/m-p/78618#M9149</guid>
      <dc:creator>szymon_dybczak</dc:creator>
      <dc:date>2024-07-12T22:01:26Z</dc:date>
    </item>
    <item>
      <title>Re: File found with %fs ls but not with spark.read</title>
      <link>https://community.databricks.com/t5/get-started-discussions/file-found-with-fs-ls-but-not-with-spark-read/m-p/111910#M9150</link>
      <description>&lt;P&gt;I have the exact same issue. Limiting the input to display() seems to work as a temporary workaround, but I wonder if there's a long-term fix. The goal is to be able to display larger datasets within a notebook. How can we achieve that? Is there a permission we can change to make it work?&lt;/P&gt;</description>
      <pubDate>Thu, 06 Mar 2025 13:39:23 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/file-found-with-fs-ls-but-not-with-spark-read/m-p/111910#M9150</guid>
      <dc:creator>xx123</dc:creator>
      <dc:date>2025-03-06T13:39:23Z</dc:date>
    </item>
  </channel>
</rss>

