08-01-2024 02:47 AM
Currently my Databricks looks like this:
I want to create a volume to access an external location. Where exactly should I create it? Should I create a new schema in the "poe" catalog and create a volume inside it, or create it in an existing schema? What is the best practice?
Accepted Solutions
08-01-2024 05:33 AM - edited 08-01-2024 05:35 AM
Hello! Volumes go inside schemas (screenshot below). It's up to you how to keep your data organised, but a few considerations:
- If you're going to have lots of volumes, does it make sense to group them together?
- If it's raw data, it's probably categorised as 'bronze' data, so you could consider keeping it alongside your other bronze assets.
- Will you have to manage access to this data? Does it make sense to group it with other data you may want to restrict or promote access to?
- Do you want it to inherit other properties from the schema, such as tags, features or access patterns?
At the end of the day, schemas and catalogs are there to keep data organised. With external volumes (and tables), the choice has no bearing on where the data is physically stored, so it doesn't have much technical impact.
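As a sketch of that layout: a volume is created inside a schema and addressed by a three-level name. The schema and volume names below are hypothetical (the "poe" catalog is from the question), and the storage path is a placeholder:

```sql
-- Hypothetical: a 'bronze' schema in the existing 'poe' catalog.
CREATE SCHEMA IF NOT EXISTS poe.bronze;

-- An external volume maps a Unity Catalog name onto a path
-- covered by an existing external location.
CREATE EXTERNAL VOLUME poe.bronze.raw_files
LOCATION 'abfss://raw@mystorageaccount.dfs.core.windows.net/landing';

-- Files in the volume are then addressed through the /Volumes path:
LIST '/Volumes/poe/bronze/raw_files/';
```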
My messy demo example:
08-01-2024 05:45 AM
Alright, thanks for your explanation. I have one more question: after creating a volume, how would you connect it to a container? Imagine you have created a volume at external location A and you want to connect it to external location B?
08-01-2024 06:57 AM
Hi hpant, each volume is mapped to one location only. If you need to get data from two different locations, you'd make two separate volumes and join them as part of your pipeline.
If you wanted to read in from one location and write to another, again, you'd do that with two separate volumes.
When I said 'group them together' above - you can have multiple volumes in one schema, even if their locations are very different.
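As a sketch of the one-location-per-volume point, with hypothetical catalog, schema, volume and storage names:

```sql
-- Hypothetical: one external volume per external location.
CREATE EXTERNAL VOLUME poe.bronze.source_a
LOCATION 'abfss://container-a@accounta.dfs.core.windows.net/data';

CREATE EXTERNAL VOLUME poe.bronze.source_b
LOCATION 'abfss://container-b@accountb.dfs.core.windows.net/data';

-- The pipeline then reads from each volume and combines them, e.g.:
SELECT * FROM read_files('/Volumes/poe/bronze/source_a/', format => 'csv');
```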
08-01-2024 07:08 AM
Hey, thanks for your response. Currently, my data lives in a container in Azure, where it gets added through an Azure Data Factory pipeline. I have created a Unity Catalog workspace in a different resource group. It has a container, but there is no data in it, and I have created a volume there. Now I want to connect the volume to the data in the container of a different storage account in a different resource group. How can I make that connection? Do I need some sort of access key mechanism?
08-05-2024 01:26 AM
Hi hpant,
You need to set up a new volume using a new external location (and potentially storage credential). Docs here: https://learn.microsoft.com/en-us/azure/databricks/sql/language-manual/sql-ref-external-locations
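A minimal sketch of that setup, assuming a storage credential (backed by an Azure access connector / managed identity with access to the other storage account) already exists; `my_credential` and all names and paths below are hypothetical:

```sql
-- Register the other account's container as an external location,
-- authenticated by an existing storage credential.
CREATE EXTERNAL LOCATION IF NOT EXISTS source_container
URL 'abfss://data@otheraccount.dfs.core.windows.net/'
WITH (STORAGE CREDENTIAL my_credential);

-- A volume can then be created on a path inside that location:
CREATE EXTERNAL VOLUME poe.bronze.source_data
LOCATION 'abfss://data@otheraccount.dfs.core.windows.net/landing';
```

No access keys are needed in this model; the managed identity behind the storage credential does the authentication.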
08-05-2024 02:01 AM
Hi @holly ,
Thanks so much for your response. I have one last question in this regard. Whenever I want to add an extra external location, do I need the Contributor role or higher on the access connector resource in Azure in order to add the storage credential first?
Thanks,
Hiamnshu Pant
08-05-2024 08:18 AM
The docs say 'Contributor or Owner of an Azure resource group', and I don't have any reason to contradict that.
08-06-2024 01:38 AM
I tried to do that but couldn't find the option.
08-06-2024 01:44 AM
No, I don't.

