<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How to upload large files to Databricks? and how to unzip files successfully? in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/how-to-upload-large-files-to-databricks-and-how-to-unzip-files/m-p/10350#M5547</link>
    <description>&lt;P&gt;Hi @Sage Olson,&lt;/P&gt;&lt;P&gt;Hope everything is going great.&lt;/P&gt;&lt;P&gt;Just wanted to check whether you were able to resolve your issue. If so, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please let us know so we can help you.&lt;/P&gt;&lt;P&gt;Cheers!&lt;/P&gt;</description>
    <pubDate>Sun, 09 Apr 2023 04:04:55 GMT</pubDate>
    <dc:creator>Anonymous</dc:creator>
    <dc:date>2023-04-09T04:04:55Z</dc:date>
    <item>
      <title>How to upload large files to Databricks? and how to unzip files successfully?</title>
      <link>https://community.databricks.com/t5/data-engineering/how-to-upload-large-files-to-databricks-and-how-to-unzip-files/m-p/10345#M5542</link>
      <description>&lt;P&gt;I have two JSON files, one ~3 GB and one ~5 GB. I am unable to upload them to Databricks Community Edition because they exceed the maximum allowed upload size (~2 GB).&lt;/P&gt;&lt;P&gt;If I zip them, I can upload them, but I can't figure out how to unzip the files into a readable format. Currently the import preview only shows unreadable characters.&lt;/P&gt;&lt;P&gt;I'm relatively new to Databricks and am just using it for a SQL certification, so I'd like to import the JSON into a queryable table.&lt;/P&gt;&lt;P&gt;Thanks.&lt;/P&gt;</description>
      <pubDate>Wed, 01 Feb 2023 05:02:46 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-to-upload-large-files-to-databricks-and-how-to-unzip-files/m-p/10345#M5542</guid>
      <dc:creator>Sagacious</dc:creator>
      <dc:date>2023-02-01T05:02:46Z</dc:date>
    </item>
    <item>
      <title>Re: How to upload large files to Databricks? and how to unzip files successfully?</title>
      <link>https://community.databricks.com/t5/data-engineering/how-to-upload-large-files-to-databricks-and-how-to-unzip-files/m-p/10346#M5543</link>
      <description>&lt;P&gt;@Sage Olson, instead of uploading directly to Databricks, you can put your data in any cloud provider's storage and then read the files from there with Databricks. It is safe.&lt;/P&gt;</description>
      <pubDate>Wed, 01 Feb 2023 05:08:23 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-to-upload-large-files-to-databricks-and-how-to-unzip-files/m-p/10346#M5543</guid>
      <dc:creator>Aviral-Bhardwaj</dc:creator>
      <dc:date>2023-02-01T05:08:23Z</dc:date>
    </item>
    <item>
      <title>Re: How to upload large files to Databricks? and how to unzip files successfully?</title>
      <link>https://community.databricks.com/t5/data-engineering/how-to-upload-large-files-to-databricks-and-how-to-unzip-files/m-p/10348#M5545</link>
      <description>&lt;P&gt;Thanks for your kind response. I've already found the article on shell commands and the unzipping information, but I don't yet have the Python background to set this up with only the documentation to go on.&lt;/P&gt;&lt;P&gt;I understand that I need to put the %sh command at the beginning of the cell, but I don't understand what to do with the "import" block of code. Where is that data being put? Once I can locate where the unzipped data goes via that import/unzip command, I can follow the notebook setup template.&lt;/P&gt;</description>
      <pubDate>Wed, 01 Feb 2023 05:45:04 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-to-upload-large-files-to-databricks-and-how-to-unzip-files/m-p/10348#M5545</guid>
      <dc:creator>Sagacious</dc:creator>
      <dc:date>2023-02-01T05:45:04Z</dc:date>
    </item>
    <item>
      <title>Re: How to upload large files to Databricks? and how to unzip files successfully?</title>
      <link>https://community.databricks.com/t5/data-engineering/how-to-upload-large-files-to-databricks-and-how-to-unzip-files/m-p/10349#M5546</link>
      <description>&lt;P&gt;After uploading the zip, copy its path from the UI and unzip it with something similar to:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;import os
import shutil
import zipfile

zip_file = "/dbfs/tmp/tmp.zip"  # path to the uploaded zip
target_dir = "/dbfs/tmp/"  # extracted files are written here

with zipfile.ZipFile(zip_file, "r") as z:
    for filename in z.namelist():
        if filename.endswith("/"):
            continue  # directory entry, nothing to write
        extracted_file = os.path.join(target_dir, filename)
        os.makedirs(os.path.dirname(extracted_file), exist_ok=True)
        # Copy in chunks so a multi-GB file is never read fully into memory
        with z.open(filename) as f, open(extracted_file, "wb") as output:
            shutil.copyfileobj(f, output)&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Wed, 01 Feb 2023 09:40:43 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-to-upload-large-files-to-databricks-and-how-to-unzip-files/m-p/10349#M5546</guid>
      <dc:creator>Hubert-Dudek</dc:creator>
      <dc:date>2023-02-01T09:40:43Z</dc:date>
    </item>
    <item>
      <title>Re: How to upload large files to Databricks? and how to unzip files successfully?</title>
      <link>https://community.databricks.com/t5/data-engineering/how-to-upload-large-files-to-databricks-and-how-to-unzip-files/m-p/10350#M5547</link>
      <description>&lt;P&gt;Hi @Sage Olson,&lt;/P&gt;&lt;P&gt;Hope everything is going great.&lt;/P&gt;&lt;P&gt;Just wanted to check whether you were able to resolve your issue. If so, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please let us know so we can help you.&lt;/P&gt;&lt;P&gt;Cheers!&lt;/P&gt;</description>
      <pubDate>Sun, 09 Apr 2023 04:04:55 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-to-upload-large-files-to-databricks-and-how-to-unzip-files/m-p/10350#M5547</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2023-04-09T04:04:55Z</dc:date>
    </item>
    <item>
      <title>Re: How to upload large files to Databricks? and how to unzip files successfully?</title>
      <link>https://community.databricks.com/t5/data-engineering/how-to-upload-large-files-to-databricks-and-how-to-unzip-files/m-p/10347#M5544</link>
      <description>&lt;P&gt;Hi, you can create a notebook attached to a Databricks cluster and unzip the files using Linux commands in the notebook. Please refer to: &lt;A href="https://docs.databricks.com/notebooks/notebooks-code.html" alt="https://docs.databricks.com/notebooks/notebooks-code.html" target="_blank"&gt;https://docs.databricks.com/notebooks/notebooks-code.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Run the notebook in Python mode and start the notebook cell with %sh, so that the cell's contents are run as shell commands and unzip the file.&lt;/P&gt;&lt;P&gt;For unzipping, you can refer to: &lt;A href="https://docs.databricks.com/files/unzip-files.html" alt="https://docs.databricks.com/files/unzip-files.html" target="_blank"&gt;https://docs.databricks.com/files/unzip-files.html&lt;/A&gt; and &lt;A href="https://community.databricks.com/s/question/0D58Y00009az9bGSAQ/unzip-files" alt="https://community.databricks.com/s/question/0D58Y00009az9bGSAQ/unzip-files" target="_blank"&gt;https://community.databricks.com/s/question/0D58Y00009az9bGSAQ/unzip-files&lt;/A&gt;.&lt;/P&gt;</description>
      <pubDate>Wed, 01 Feb 2023 05:19:45 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-to-upload-large-files-to-databricks-and-how-to-unzip-files/m-p/10347#M5544</guid>
      <dc:creator>Debayan</dc:creator>
      <dc:date>2023-02-01T05:19:45Z</dc:date>
    </item>
  </channel>
</rss>

