<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Upload file from local file system to Unity Catalog Volume (via databricks-connect) in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/upload-file-from-local-file-system-to-unity-catalog-volume-via/m-p/65325#M32784</link>
    <description>&lt;P&gt;Late to the discussion, but I too was looking for a way to do this _programmatically_, as opposed to through the UI.&lt;/P&gt;&lt;P&gt;The solution I landed on was using the Python SDK (though you could assuredly do this using an API request instead if you're not in Python):&lt;/P&gt;&lt;LI-CODE lang="python"&gt;import io
from databricks.sdk import WorkspaceClient
w = WorkspaceClient()
w.files.upload('/your/volume/path/foo.txt', io.BytesIO(b'foo bar'))  # contents must be a binary file-like object&lt;/LI-CODE&gt;</description>
    <pubDate>Tue, 02 Apr 2024 18:20:56 GMT</pubDate>
    <dc:creator>lathaniel</dc:creator>
    <dc:date>2024-04-02T18:20:56Z</dc:date>
    <item>
      <title>Upload file from local file system to Unity Catalog Volume (via databricks-connect)</title>
      <link>https://community.databricks.com/t5/data-engineering/upload-file-from-local-file-system-to-unity-catalog-volume-via/m-p/59669#M31470</link>
      <description>&lt;P&gt;&lt;STRONG&gt;Context:&lt;/STRONG&gt;&lt;BR /&gt;IDE: IntelliJ 2023.3.2&lt;BR /&gt;Library: databricks-connect 13.3&lt;BR /&gt;Python: 3.10&lt;BR /&gt;&lt;BR /&gt;&lt;STRONG&gt;Description:&lt;/STRONG&gt;&lt;BR /&gt;&lt;SPAN&gt;I develop notebooks and Python scripts locally in the IDE and connect to the Spark cluster via databricks-connect for a better developer experience.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;I download a file from the public internet and want to store it in an external Unity Catalog Volume (hosted on S3). I would like to upload the file using a volume path and &lt;STRONG&gt;not&lt;/STRONG&gt; upload it directly to S3 with AWS credentials.&lt;/P&gt;&lt;P&gt;Everything works fine in a Databricks notebook, e.g.:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="python"&gt;dbutils.fs.cp("&amp;lt;local/file/path&amp;gt;", "/Volumes/&amp;lt;path&amp;gt;")&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;or:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="python"&gt;source_file = ...
with open("/Volumes/&amp;lt;path&amp;gt;", 'wb') as destination_file:
    destination_file.write(source_file)&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I can't figure out a way to do that locally from my IDE.&lt;BR /&gt;Using dbutils:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="python"&gt;dbutils.fs.cp("file:/&amp;lt;local/path&amp;gt;", "/Volumes/&amp;lt;path&amp;gt;")&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I get the error:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="python"&gt;databricks.sdk.errors.mapping.InvalidParameterValue: Path must be absolute: \Volumes\&amp;lt;path&amp;gt;&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Writing with Python's &lt;EM&gt;with&lt;/EM&gt; statement won't work either, because the Unity Catalog Volume is not mounted on my local machine.&lt;BR /&gt;&lt;BR /&gt;Is there a way to upload files from the local machine or from memory into Unity Catalog Volumes?&lt;/P&gt;</description>
      <pubDate>Thu, 08 Feb 2024 10:16:59 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/upload-file-from-local-file-system-to-unity-catalog-volume-via/m-p/59669#M31470</guid>
      <dc:creator>Husky</dc:creator>
      <dc:date>2024-02-08T10:16:59Z</dc:date>
    </item>
    <item>
      <title>Re: Upload file from local file system to Unity Catalog Volume (via databricks-connect)</title>
      <link>https://community.databricks.com/t5/data-engineering/upload-file-from-local-file-system-to-unity-catalog-volume-via/m-p/59908#M31531</link>
      <description>&lt;P&gt;Thanks for your answer. But I want to upload the files/data programmatically and not manually with the Databricks UI.&lt;/P&gt;</description>
      <pubDate>Mon, 12 Feb 2024 09:52:24 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/upload-file-from-local-file-system-to-unity-catalog-volume-via/m-p/59908#M31531</guid>
      <dc:creator>Husky</dc:creator>
      <dc:date>2024-02-12T09:52:24Z</dc:date>
    </item>
    <item>
      <title>Re: Upload file from local file system to Unity Catalog Volume (via databricks-connect)</title>
      <link>https://community.databricks.com/t5/data-engineering/upload-file-from-local-file-system-to-unity-catalog-volume-via/m-p/65325#M32784</link>
      <description>&lt;P&gt;Late to the discussion, but I too was looking for a way to do this _programmatically_, as opposed to through the UI.&lt;/P&gt;&lt;P&gt;The solution I landed on was using the Python SDK (though you could assuredly do this using an API request instead if you're not in Python):&lt;/P&gt;&lt;LI-CODE lang="python"&gt;import io
from databricks.sdk import WorkspaceClient
w = WorkspaceClient()
w.files.upload('/your/volume/path/foo.txt', io.BytesIO(b'foo bar'))  # contents must be a binary file-like object&lt;/LI-CODE&gt;</description>
      <pubDate>Tue, 02 Apr 2024 18:20:56 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/upload-file-from-local-file-system-to-unity-catalog-volume-via/m-p/65325#M32784</guid>
      <dc:creator>lathaniel</dc:creator>
      <dc:date>2024-04-02T18:20:56Z</dc:date>
    </item>
    <item>
      <title>Re: Upload file from local file system to Unity Catalog Volume (via databricks-connect)</title>
      <link>https://community.databricks.com/t5/data-engineering/upload-file-from-local-file-system-to-unity-catalog-volume-via/m-p/68188#M33580</link>
      <description>&lt;P&gt;Thanks, that's what I was looking for.&lt;/P&gt;&lt;P&gt;That said, it would be nice to pass just the path of the file to upload instead of having to read its binary contents first.&lt;/P&gt;</description>
      <pubDate>Mon, 06 May 2024 09:30:16 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/upload-file-from-local-file-system-to-unity-catalog-volume-via/m-p/68188#M33580</guid>
      <dc:creator>Husky</dc:creator>
      <dc:date>2024-05-06T09:30:16Z</dc:date>
    </item>
    <item>
      <title>Re: Upload file from local file system to Unity Catalog Volume (via databricks-connect)</title>
      <link>https://community.databricks.com/t5/data-engineering/upload-file-from-local-file-system-to-unity-catalog-volume-via/m-p/68818#M33750</link>
      <description>&lt;P&gt;Hey Husky,&lt;/P&gt;
&lt;P&gt;You can provide just the path to the file to upload by wrapping a REST API call: &lt;A href="https://docs.databricks.com/api/workspace/files/upload" target="_blank"&gt;https://docs.databricks.com/api/workspace/files/upload&lt;/A&gt;. It's in Public Preview. Please see below.&lt;/P&gt;
&lt;LI-CODE lang="python"&gt;import re
import requests

def return_ws_url():
    # str() of the tag value looks like "Some(host)", so pull the host out of it
    workspace_url = dbutils.notebook.entry_point.getDbutils().notebook().getContext().tags().get("browserHostName")
    match = re.match(r'Some\((.*)\)', str(workspace_url))
    if match:
        return match.group(1)
    raise ValueError("No workspace host name found in the notebook context")

def upload_ws_file_to_volume(local_path, remote_path):
    # Stream the local file to the Files API; remote_path is the destination volume path
    with open(local_path, 'rb') as f:
        r = requests.put(
            'https://{databricks_instance}/api/2.0/fs/files{path}'.format(
                databricks_instance=return_ws_url(), path=remote_path),
            headers=headers,
            data=f)
        r.raise_for_status()

# Build the auth header from the notebook's API token (not printed, since it is a secret)
headers = {'Authorization': 'Bearer {}'.format(dbutils.notebook.entry_point.getDbutils().notebook().getContext().apiToken().get())}

upload_ws_file_to_volume(&amp;lt;&amp;lt;Your source file local path&amp;gt;&amp;gt;, &amp;lt;&amp;lt;UC Volume path&amp;gt;&amp;gt;)&lt;/LI-CODE&gt;
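&lt;P&gt;For anyone calling this from a local machine instead of a notebook, here is a rough sketch of the same PUT request against /api/2.0/fs/files using only the Python standard library. It assumes you pass in the workspace host and an access token yourself, since the notebook-context lookups may not be available locally. The helper names build_files_api_url and upload_local_file are made up for illustration and are not part of any Databricks library, and the /Volumes/ prefix check is my own assumption based on the paths used in this thread.&lt;/P&gt;

```python
import os
import urllib.parse
import urllib.request


def build_files_api_url(host, volume_path):
    """Build the Files API URL for a volume path (hypothetical helper)."""
    # Assumption: this thread only deals with paths under /Volumes/.
    if not volume_path.startswith("/Volumes/"):
        raise ValueError("volume_path must start with /Volumes/")
    # Percent-encode spaces and other special characters, keeping the slashes.
    return "https://{}/api/2.0/fs/files{}".format(host, urllib.parse.quote(volume_path))


def upload_local_file(host, token, local_path, volume_path):
    """PUT the contents of local_path to volume_path (hypothetical helper).

    urlopen raises urllib.error.HTTPError if the server answers with an error status.
    """
    with open(local_path, "rb") as f:
        request = urllib.request.Request(
            build_files_api_url(host, volume_path),
            data=f,
            method="PUT",
            headers={
                "Authorization": "Bearer {}".format(token),
                "Content-Type": "application/octet-stream",
                # An explicit length avoids chunked transfer encoding for file objects.
                "Content-Length": str(os.path.getsize(local_path)),
            },
        )
        with urllib.request.urlopen(request) as response:
            return response.status
```

&lt;P&gt;The path is percent-encoded so that file names containing spaces still produce a valid URL. The host and token are plain arguments, so they can come from environment variables or whatever secret store you already use.&lt;/P&gt;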
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sat, 11 May 2024 18:59:40 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/upload-file-from-local-file-system-to-unity-catalog-volume-via/m-p/68818#M33750</guid>
      <dc:creator>dkushari</dc:creator>
      <dc:date>2024-05-11T18:59:40Z</dc:date>
    </item>
  </channel>
</rss>

