Volumes unzip files
08-30-2024 06:18 AM
I have this shell unzip script that I use to unzip files, but now it fails with:

sudo: a terminal is required to read the password; either use the -S option to read from standard input or configure an askpass helper
sudo: a password is required
Reading package lists...
E: Could not open lock file /var/lib/apt/lists/lock - open (13: Permission denied)
E: Unable to lock directory /var/lib/apt/lists/
E: Could not open lock file /var/lib/dpkg/lock-frontend - open (13: Permission denied)
E: Unable to acquire the dpkg frontend lock (/var/lib/dpkg/lock-frontend), are you root?
bash: line 7: 7z: command not found

How could I unzip files with Volumes, if I already have them there?
Should I make a cluster init script to do that, or is there a better way?
The files are password-protected.
The unzip script that worked before:
%sh
for file in /dbfs/mnt/zip/$source/*.zip
do
  7z x "$file" -p$pw -o/dbfs/mnt/zip/$source/unzipped/ -y
done
Any good ideas welcome 🙂
12-27-2024 09:53 AM
Thank you for your question! Have you tried using a cluster init script to install p7zip automatically when the cluster starts? This avoids the need for sudo during your session.
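For example, here is a minimal sketch of such an init script, assuming an Ubuntu-based Databricks Runtime where 7-Zip ships as the p7zip-full apt package; save it as a workspace or volume file and reference it in the cluster's init scripts settings:

#!/bin/bash
# Cluster init script: installs 7-Zip so the 7z command is available in %sh cells.
# Init scripts run as root, so no sudo is required.
set -e
apt-get update
apt-get install -y p7zip-full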
Alternatively, if unzip is already available, you can modify your script like this:
%sh
for file in /dbfs/mnt/zip/$source/*.zip
do
unzip -P "$pw" "$file" -d /dbfs/mnt/zip/$source/unzipped/
done
12-27-2024 08:12 PM
First, read the ZIP files in binary format with spark.read.format("binaryFile"); then use Python's zipfile package to extract their contents and store them in a Volume.
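A minimal sketch of that approach, assuming hypothetical volume paths and secret names, the notebook globals spark and dbutils, and archives encrypted with classic ZipCrypto (Python's zipfile cannot decrypt AES-encrypted zips):

import io
import zipfile

# Hypothetical volume paths and secret scope/key - replace with your own.
src_volume = "/Volumes/main/raw/zips"
dst_volume = "/Volumes/main/raw/unzipped"
pw = dbutils.secrets.get(scope="my_scope", key="zip_password")

# Read every zip in the source volume as binary content (path, content columns).
df = (spark.read.format("binaryFile")
      .option("pathGlobFilter", "*.zip")
      .load(src_volume))

# collect() pulls the bytes to the driver, so this suits modestly sized archives.
for row in df.select("path", "content").collect():
    with zipfile.ZipFile(io.BytesIO(row.content)) as zf:
        # pwd only handles ZipCrypto; AES-encrypted zips need a library such as pyzipper.
        zf.extractall(path=dst_volume, pwd=pw.encode("utf-8"))

Since Unity Catalog volumes are also exposed on the local filesystem under /Volumes, the same zipfile (or 7z) calls can target those paths directly without going through Spark.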

