
Not able to unzip a zip file with a mount and Unity Catalog

Gareema
New Contributor III

Hello Team, 

I have a zip file in ADLS Gen2. The folder I am using is mounted, and when I run the command dbutils.fs.ls(path), it lists all the files (including the required zip).

However, when I try to read the zip using the 'zipfile' module, it raises a 'FileNotFoundError'.

(screenshot: FileNotFoundError traceback)
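
Roughly, the repro looks like this (placeholder names stand in for the real container, folder, and file):

import zipfile

path = "/mnt/<container-name>/<folder-name>"

# Listing through the mount works: the zip shows up here.
display(dbutils.fs.ls(path))

# Opening the same file through the /dbfs FUSE path fails.
with zipfile.ZipFile("/dbfs" + path + "/<file>.zip") as zf:
    print(zf.namelist())
# -> FileNotFoundError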

I am able to copy this file from the mount point to a volume and unzip it there, and that works fine.
However, I want to keep using mounts for the time being. The file size is around 3 GB and I am using Unity Catalog.

I have also tried converting the zip to binary and then reading it, but there is a file-size limitation with that approach.
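
For reference, the binary attempt looked roughly like this (placeholder path; the 'content' column has to hold the whole file as one byte array, which is where a ~3 GB zip hits the limit):

import io
import zipfile

# Read the whole zip as a single binary row.
df = (spark.read.format("binaryFile")
      .load("dbfs:/mnt/<container-name>/<folder-name>/<file>.zip"))

payload = df.select("content").head()[0]

# Unzip in memory -- works for small files, fails once 'content'
# exceeds the single-byte-array limit.
with zipfile.ZipFile(io.BytesIO(payload)) as zf:
    print(zf.namelist())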

My requirement is simply to read the files' metadata and then unzip them.

3 Replies

Gareema
New Contributor III

Thanks @Retired_mod.

It is true that os.path.isdir(path) returns False, which is the main concern: when I use dbutils, or simply read the whole zip as a text file, there is no issue with the path. For example:

(screenshot: reading the file via the same path succeeds)


But when I try to use it otherwise, it fails with the message 'path not found'.
I am using DBFS mount points with ADLS Gen2 mounted to the cluster,
so my path looks something like this: '/dbfs/mnt/<container-name>/<folder-name>'

Could this be a problem with the mounting or with Unity Catalog? How can the same path be accessible to some APIs but not to others?
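
A minimal sketch of the contrast (same placeholder path as above):

import os

mount_path = "/mnt/<container-name>/<folder-name>"

# Works: dbutils resolves the mount through the Spark driver.
print(dbutils.fs.ls(mount_path))

# Fails: the local filesystem view of the mount is not visible to
# plain Python, so os / open() / zipfile report the path as missing.
print(os.path.isdir("/dbfs" + mount_path))   # False
print(os.path.exists("/dbfs" + mount_path))  # False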

Witold
Contributor III

@Gareema, since you're using UC, can you use Volumes instead? They essentially replace the old mount approach.
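
A sketch of what that could look like (hypothetical catalog/schema/volume names), which also covers reading the metadata before extracting:

import zipfile

# Copy the zip from the legacy mount into a UC volume...
src = "dbfs:/mnt/<container-name>/<folder-name>/<file>.zip"
dst = "/Volumes/<catalog>/<schema>/<volume>/<file>.zip"
dbutils.fs.cp(src, dst)

# ...then the volume's FUSE path works with the standard zipfile module.
with zipfile.ZipFile(dst) as zf:
    for info in zf.infolist():               # metadata first
        print(info.filename, info.file_size)
    zf.extractall("/Volumes/<catalog>/<schema>/<volume>/unzipped")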

Gareema
New Contributor III

@Witold, I know, but there is a business constraint, so we can't move to Volumes yet. And until we do, some legacy data will remain on the mount, so we can't drop that either.
