<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How to manipulate files in an external location? in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/how-to-manipulate-files-in-an-external-location/m-p/3011#M210</link>
    <description>&lt;P&gt;The main problem was the network configuration of the storage account: Databricks did not have access. Quite strange that it did manage to create folders...&lt;/P&gt;&lt;P&gt;dbutils.fs functionality is now working.&lt;/P&gt;&lt;P&gt;As for the zipfile manipulation: that only works with local (or mounted) locations.&lt;/P&gt;&lt;P&gt;Workaround: copy files between local storage and abfss as required.&lt;/P&gt;</description>
    <pubDate>Mon, 19 Jun 2023 12:50:12 GMT</pubDate>
    <dc:creator>Tjomme</dc:creator>
    <dc:date>2023-06-19T12:50:12Z</dc:date>
    <item>
      <title>How to manipulate files in an external location?</title>
      <link>https://community.databricks.com/t5/data-engineering/how-to-manipulate-files-in-an-external-location/m-p/3004#M203</link>
      <description>&lt;P&gt;According to the documentation, external locations are preferred over mount points.&lt;/P&gt;&lt;P&gt;Unfortunately, the basic functionality to manipulate files seems to be missing.&lt;/P&gt;&lt;P&gt;This is my scenario:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;create a download folder in an external location if it does not exist:&lt;/LI&gt;&lt;/UL&gt;&lt;PRE&gt;&lt;CODE&gt;dbutils.fs.mkdirs(NewPath)  # does not work --&amp;gt; Operation failed: "This request is not authorized to perform this operation."&lt;/CODE&gt;&lt;/PRE&gt;&lt;UL&gt;&lt;LI&gt;use an API to download zip files from a source and write them to a mounted location using:&lt;/LI&gt;&lt;/UL&gt;&lt;PRE&gt;&lt;CODE&gt;f = open(fullFileName, 'w+b')  # --&amp;gt; FileNotFoundError: [Errno 2] No such file or directory
f.write(ZipBinaryData)
f.close()&lt;/CODE&gt;&lt;/PRE&gt;&lt;UL&gt;&lt;LI&gt;loop over all zip files (&lt;B&gt;dbutils.fs.ls does not work; it needs to be replaced with LIST&lt;/B&gt;) to:&lt;UL&gt;&lt;LI&gt;unzip them into an extract folder containing JSON files (not tested yet, but using &lt;B&gt;zipfile.ZipFile(fullZipFileName)&lt;/B&gt;)&lt;/LI&gt;&lt;LI&gt;load the JSON files into a (raw) managed table (should not be an issue)&lt;/LI&gt;&lt;LI&gt;further process the managed table (should not be an issue)&lt;/LI&gt;&lt;LI&gt;empty the extract folder using:&lt;/LI&gt;&lt;/UL&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;PRE&gt;&lt;CODE&gt;dbutils.fs.rm(NewPath, True)  # --&amp;gt; Operation failed: "This request is not authorized to perform this operation."&lt;/CODE&gt;&lt;/PRE&gt;&lt;UL&gt;&lt;LI&gt;move the zip file to an archive folder using:&lt;/LI&gt;&lt;/UL&gt;&lt;PRE&gt;&lt;CODE&gt;dbutils.fs.mv(NewPath, ArchivePath, True)  # --&amp;gt; Operation failed: "This request is not authorized to perform this operation."&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;Any help or insights on how to get this working with external locations is greatly appreciated!&lt;/P&gt;</description>
      <pubDate>Thu, 15 Jun 2023 12:56:28 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-to-manipulate-files-in-an-external-location/m-p/3004#M203</guid>
      <dc:creator>Tjomme</dc:creator>
      <dc:date>2023-06-15T12:56:28Z</dc:date>
    </item>
    <item>
      <title>Re: How to manipulate files in an external location?</title>
      <link>https://community.databricks.com/t5/data-engineering/how-to-manipulate-files-in-an-external-location/m-p/3005#M204</link>
      <description>&lt;P&gt;Sounds like a cloud provider permission issue. Which one are you using: AWS or Azure? How are you connecting to Blob storage: via an external location with a managed identity, or a SAS token? The easiest way to test connectivity is to click "Test connection" in the external location tab under "Data" (bottom left). If that succeeds, test a simple listing of the directory:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;dbutils.fs.ls("&amp;lt;blob url&amp;gt;")&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Fri, 16 Jun 2023 02:20:29 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-to-manipulate-files-in-an-external-location/m-p/3005#M204</guid>
      <dc:creator>etsyal1e2r3</dc:creator>
      <dc:date>2023-06-16T02:20:29Z</dc:date>
    </item>
    <item>
      <title>Re: How to manipulate files in an external location?</title>
      <link>https://community.databricks.com/t5/data-engineering/how-to-manipulate-files-in-an-external-location/m-p/3006#M205</link>
      <description>&lt;P&gt;Hi @Tjomme Vergauwen&lt;/P&gt;&lt;P&gt;We haven't heard from you since the last response from @Tyler Retzlaff, and I was checking back to see if her suggestions helped you.&lt;/P&gt;&lt;P&gt;Otherwise, if you have found a solution, please share it with the community, as it can be helpful to others.&lt;/P&gt;&lt;P&gt;Also, please don't forget to click the "Select As Best" button whenever the information provided helps resolve your question.&lt;/P&gt;</description>
      <pubDate>Fri, 16 Jun 2023 03:38:59 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-to-manipulate-files-in-an-external-location/m-p/3006#M205</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2023-06-16T03:38:59Z</dc:date>
    </item>
    <item>
      <title>Re: How to manipulate files in an external location?</title>
      <link>https://community.databricks.com/t5/data-engineering/how-to-manipulate-files-in-an-external-location/m-p/3007#M206</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;We're using Azure.&lt;/P&gt;&lt;P&gt;External locations are created using a managed identity.&lt;/P&gt;&lt;P&gt;It's not a security issue, as demonstrated below:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="image"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/75i4AEC0C2F33F71E86/image-size/large?v=v2&amp;amp;px=999" role="button" title="image" alt="image" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;Same folder, different syntax to get the list of files. The first one works; the second one throws an error.&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;LIST 'abfss://landingzone@***.dfs.core.windows.net/DEV' -- works

%py
dbutils.fs.ls('abfss://landingzone@***.dfs.core.windows.net/DEV')  # throws error&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Fri, 16 Jun 2023 06:43:42 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-to-manipulate-files-in-an-external-location/m-p/3007#M206</guid>
      <dc:creator>Tjomme</dc:creator>
      <dc:date>2023-06-16T06:43:42Z</dc:date>
    </item>
    <item>
      <title>Re: How to manipulate files in an external location?</title>
      <link>https://community.databricks.com/t5/data-engineering/how-to-manipulate-files-in-an-external-location/m-p/3008#M207</link>
      <description>&lt;P&gt;That's really weird... Can you go into the external location in the Databricks Data tab and make sure your user has the right permissions?&lt;/P&gt;</description>
      <pubDate>Fri, 16 Jun 2023 10:47:26 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-to-manipulate-files-in-an-external-location/m-p/3008#M207</guid>
      <dc:creator>etsyal1e2r3</dc:creator>
      <dc:date>2023-06-16T10:47:26Z</dc:date>
    </item>
    <item>
      <title>Re: How to manipulate files in an external location?</title>
      <link>https://community.databricks.com/t5/data-engineering/how-to-manipulate-files-in-an-external-location/m-p/3009#M208</link>
      <description>&lt;P&gt;It seems my access rights on the storage account are in order, but the ones on the container are missing. Reference: &lt;A href="https://stackoverflow.com/questions/74332230/databricks-unitycatalog-create-table-fails-with-failed-to-acquire-a-sas-token-u" alt="https://stackoverflow.com/questions/74332230/databricks-unitycatalog-create-table-fails-with-failed-to-acquire-a-sas-token-u" target="_blank"&gt;DataBricks UnityCatalog create table fails with "Failed to acquire a SAS token UnauthorizedAccessException: PERMISSION_DENIED: request not authorized" - Stack Overflow&lt;/A&gt;&lt;/P&gt;&lt;P&gt;I'll have this changed and retry.&lt;/P&gt;</description>
      <pubDate>Fri, 16 Jun 2023 14:06:05 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-to-manipulate-files-in-an-external-location/m-p/3009#M208</guid>
      <dc:creator>Tjomme</dc:creator>
      <dc:date>2023-06-16T14:06:05Z</dc:date>
    </item>
    <item>
      <title>Re: How to manipulate files in an external location?</title>
      <link>https://community.databricks.com/t5/data-engineering/how-to-manipulate-files-in-an-external-location/m-p/3010#M209</link>
      <description>&lt;P&gt;Cool, let me know how it goes&lt;/P&gt;</description>
      <pubDate>Fri, 16 Jun 2023 15:25:37 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-to-manipulate-files-in-an-external-location/m-p/3010#M209</guid>
      <dc:creator>etsyal1e2r3</dc:creator>
      <dc:date>2023-06-16T15:25:37Z</dc:date>
    </item>
    <item>
      <title>Re: How to manipulate files in an external location?</title>
      <link>https://community.databricks.com/t5/data-engineering/how-to-manipulate-files-in-an-external-location/m-p/3011#M210</link>
      <description>&lt;P&gt;The main problem was the network configuration of the storage account: Databricks did not have access. Quite strange that it did manage to create folders...&lt;/P&gt;&lt;P&gt;dbutils.fs functionality is now working.&lt;/P&gt;&lt;P&gt;As for the zipfile manipulation: that only works with local (or mounted) locations.&lt;/P&gt;&lt;P&gt;Workaround: copy files between local storage and abfss as required.&lt;/P&gt;</description>
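      <content:encoded><![CDATA[The workaround in this post (stage files on local storage, because zipfile only works with local paths) could be sketched roughly as follows. This is a minimal sketch, not the poster's actual code: the function name unzip_via_local and the download/upload callables are hypothetical and exist only for illustration.

```python
import tempfile
import zipfile
from pathlib import Path


def unzip_via_local(remote_zip, remote_dir, download, upload):
    """Extract a remote zip by staging it on local disk.

    download(remote, local) and upload(local, remote) are caller-supplied
    (hypothetical) callables that each copy a single file.
    Returns the list of remote paths that were written.
    """
    written = []
    with tempfile.TemporaryDirectory() as tmp:
        local_zip = Path(tmp) / "archive.zip"
        extract_dir = Path(tmp) / "extracted"

        # 1. Bring the archive to local storage, where zipfile can open it.
        download(remote_zip, str(local_zip))

        # 2. Extract locally.
        with zipfile.ZipFile(local_zip) as zf:
            zf.extractall(extract_dir)

        # 3. Copy every extracted file back to the remote folder,
        #    preserving the relative layout inside the archive.
        for path in sorted(extract_dir.rglob("*")):
            if path.is_file():
                rel = path.relative_to(extract_dir).as_posix()
                target = remote_dir.rstrip("/") + "/" + rel
                upload(str(path), target)
                written.append(target)
    return written
```

On Databricks, the two callables might wrap dbutils.fs.cp with a file: prefix on the local side, for example download = lambda remote, local: dbutils.fs.cp(remote, "file:" + local) and upload = lambda local, remote: dbutils.fs.cp("file:" + local, remote). Whether local file: paths are allowed depends on the cluster's access mode, so treat that wiring as an assumption to verify.]]></content:encoded>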
      <pubDate>Mon, 19 Jun 2023 12:50:12 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-to-manipulate-files-in-an-external-location/m-p/3011#M210</guid>
      <dc:creator>Tjomme</dc:creator>
      <dc:date>2023-06-19T12:50:12Z</dc:date>
    </item>
  </channel>
</rss>

