Need to move files from one Volume to another
12-19-2024 12:41 AM
We recently enabled Unity Catalog on our workspace. As part of certain transformations (custom clustered data pipelines in Python), we need to move files from one volume to another volume.
Since the job itself runs under a service principal that has access to the external storage, we don't want to pass in any credentials. Can we achieve this? We tried os, dbutils, and the Workspace client, all of which need service principal credentials. We did manage to read the volume through the Spark context itself, but for moving files we need another way. Please help.
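For reference, the Spark-based read that already works looks roughly like this (a minimal sketch with placeholder catalog/schema/volume names):
# Reading from a UC volume through the Spark session (placeholder path).
df = spark.read.option("header", "true").csv("/Volumes/<catalog>/<schema>/<volume>/<file_name>")
df.show()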
12-19-2024 03:40 AM
You should be able to use dbutils.fs.cp to copy the file, but you just need to ensure that the SP has the WRITE VOLUME permission on the destination Volume.
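A minimal sketch of that, with placeholder catalog/schema/volume and file names:
# Copy a file between UC volumes with dbutils (placeholder paths).
src = "/Volumes/<source_catalog>/<source_schema>/<source_volume>/<file_name>"
dst = "/Volumes/<destination_catalog>/<destination_schema>/<destination_volume>/<file_name>"
dbutils.fs.cp(src, dst)  # SP needs READ VOLUME on the source and WRITE VOLUME on the destination
# dbutils.fs.rm(src)     # optionally delete the source afterwards to complete a "move"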
12-19-2024 03:47 AM
Thanks for that.
But I have a Python data pipeline running on a custom cluster, and it's not working from there.
12-19-2024 03:51 AM
What is the error being received? And does the SP have the mentioned permission in UC?
12-22-2024 11:24 PM
Hi @navi_bricks ,
It can be achieved by creating a new notebook and writing the dbutils cp or mv command in that notebook (see the sketch below). After that, you can create a workflow, or a small independent ADF pipeline, that runs as the same SP which has the permission. It will run and move the files.
Thanks
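A minimal sketch of such a notebook cell, with placeholder paths; when the notebook runs as a workflow task, dbutils uses the job's SP identity:
# Move a file between UC volumes inside the notebook (placeholder paths).
src = "/Volumes/<source_catalog>/<source_schema>/<source_volume>/<file_name>"
dst = "/Volumes/<destination_catalog>/<destination_schema>/<destination_volume>/<file_name>"
dbutils.fs.mv(src, dst)  # moving deletes the source file, so the SP needs WRITE VOLUME on both volumes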
12-24-2024 05:30 AM
Thanks for that @MujtabaNoori
Instead of using a notebook, can I use the WorkspaceClient from the Databricks SDK to move the files?
01-03-2025 02:45 PM
You can try:
from databricks.sdk import WorkspaceClient
# Initialize the WorkspaceClient
w = WorkspaceClient()
# Define source and destination paths
source_path = "/Volumes/<source_catalog>/<source_schema>/<source_volume>/<file_name>"
destination_path = "/Volumes/<destination_catalog>/<destination_schema>/<destination_volume>/<file_name>"
# Move the file
w.files.move(source_path, destination_path)
# Verify the file has been moved
for item in w.files.list_directory_contents("/Volumes/<destination_catalog>/<destination_schema>/<destination_volume>"):
print(item.path)
01-15-2025 05:02 AM
Sorry for the late reply. I was on vacation and didn't check this. I tried this but always got the error "default auth: cannot configure default credentials." I even tried passing the client ID and secret as arguments.
01-15-2025 07:44 AM - edited 01-15-2025 07:45 AM
# Using Databricks CLI profile to access the workspace.
w = WorkspaceClient(profile="<<profilename>>")
OR
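A sketch of one alternative, assuming the intent was to pass the service principal's OAuth (M2M) credentials explicitly; the host, client ID, and secret below are placeholders:
# Authenticate the SDK explicitly with the SP's OAuth credentials (placeholder values).
from databricks.sdk import WorkspaceClient
w = WorkspaceClient(
    host="https://<workspace-url>",
    client_id="<sp-application-id>",
    client_secret="<sp-oauth-secret>",
)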
01-15-2025 08:45 AM
Not all job clusters work well with Volumes. I used the following type of cluster to access files from a Volume.

