Data Engineering

Databricks reading from a zip file

tariq
New Contributor III

I have mounted an Azure Blob Storage container in the Azure Databricks workspace filestore. The mounted container holds zip files with CSV files inside them. What is the best way to read the zipped files and write the data into a Delta table?

@sasikumar sagabala

2 REPLIES

Debayan
Databricks Employee

Hi @Tarique Anwar, Hadoop does not support zip files as a compression codec. A text file in GZip, BZip2, or another supported compression format is decompressed automatically in Apache Spark as long as it has the right file extension, but you must perform additional steps to read zip files.
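To make the distinction above concrete, here is a minimal standard-library sketch (not taken from the linked documentation) that tells a gzip stream from a zip archive by content. The helper names build_samples and describe are hypothetical. It also illustrates one likely reason a plain codec does not fit: a zip file is an archive that can hold several members, whereas gzip wraps a single stream.

```python
import gzip
import io
import zipfile


def build_samples():
    """Create a tiny gzip stream and a tiny two-member zip archive in memory."""
    csv_bytes = b"id,name\n1,a\n"
    gz_bytes = gzip.compress(csv_bytes)
    zip_buffer = io.BytesIO()
    with zipfile.ZipFile(zip_buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("first.csv", csv_bytes)
        archive.writestr("second.csv", b"id,name\n2,b\n")
    return gz_bytes, zip_buffer.getvalue()


def describe(payload):
    """Return (kind, member names) where kind is 'gzip', 'zip', or 'unknown'."""
    # gzip streams start with the two magic bytes 0x1f 0x8b.
    if payload[:2] == b"\x1f\x8b":
        return "gzip", None
    # zipfile.is_zipfile accepts a file-like object and looks for the zip directory.
    if zipfile.is_zipfile(io.BytesIO(payload)):
        with zipfile.ZipFile(io.BytesIO(payload)) as archive:
            return "zip", archive.namelist()
    return "unknown", None


gz_bytes, zip_bytes = build_samples()
print(describe(gz_bytes))
print(describe(zip_bytes))
```

A check like this could be used to route files: gzip files can go straight to the Spark reader, while zip archives need an explicit extraction step first.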

The notebooks in the documentation linked below show how to read zip files. After you download a zip file to a temp directory, you can use the Azure Databricks %sh magic command to run unzip on the file. For the sample file used in the notebooks, the tail step removes a comment line from the unzipped file.

Please refer: https://learn.microsoft.com/en-us/azure/databricks/external-data/zip-files
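As an alternative to the shell step, the extraction can be sketched in Python with the standard zipfile module. This is a minimal sketch under stated assumptions, not the method used in the linked notebooks: extract_csv_members is a hypothetical helper, and the demo builds its own sample archive in a temp directory rather than touching a real mount.

```python
import tempfile
import zipfile
from pathlib import Path


def extract_csv_members(zip_dir, out_dir):
    """Extract every .csv member of every .zip in zip_dir into out_dir.

    Each archive gets its own subfolder (named after the archive) so that
    members with the same name in different archives do not overwrite
    each other. Returns the sorted list of extracted file paths.
    """
    out_dir = Path(out_dir)
    extracted = []
    for zip_path in sorted(Path(zip_dir).glob("*.zip")):
        target = out_dir / zip_path.stem
        with zipfile.ZipFile(zip_path) as archive:
            for info in archive.infolist():
                # Skip directory entries and anything that is not a CSV file.
                if info.is_dir() or not info.filename.lower().endswith(".csv"):
                    continue
                extracted.append(Path(archive.extract(info, path=target)))
    return sorted(extracted)


# Demo on a throwaway directory with one sample archive.
work = Path(tempfile.mkdtemp())
(work / "in").mkdir()
with zipfile.ZipFile(work / "in" / "sales.zip", "w") as archive:
    archive.writestr("jan.csv", "id,amount\n1,10\n")
    archive.writestr("notes.txt", "not a csv")

paths = extract_csv_members(work / "in", work / "out")
print([p.name for p in paths])
```

On Databricks, zip_dir would presumably be a local-file path to the mounted container (for example under /dbfs/mnt/, with the mount name depending on your setup), assuming the cluster exposes that path to local file APIs. The extracted folder could then be loaded with something like spark.read.option("header", "true").csv(...) and written out with .write.format("delta").saveAsTable(...), where the table name is yours to choose.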

Please let us know if this helps.

Rishitha
New Contributor III

Hello @Debayan, I recently came across a similar scenario. Is there a way to do this via Auto Loader? Zip archives are added to our AWS S3 bucket daily, and we want to unzip them and load the CSV files continuously (auto-loading).
