11-16-2021 01:43 AM
Hi Guys
Is there any documentation on where the /databricks-datasets/ mount is actually served from?
We are looking at locking down where our workspace can reach out to over the internet, and as it currently stands we are unable to reach this mount.
I did look at the mounts() command, but it didn't give a URL, and the readme in the root of the mount didn't have any useful info either.
Thanks
Mat
11-16-2021 03:19 AM
It is in the DBFS default storage:
/dbfs/databricks-datasets
dbfs:/databricks-datasets
In the console it appears under Data, not Workspace. To test, you can use:
dbutils.fs.ls('/databricks-datasets')
(in dbutils there is no need to add a dbfs: prefix, as dbutils only operates on DBFS)
When you use the REST API, the Workspace API is for notebooks, libraries, etc., not for DBFS. To access DBFS storage you need to use the DBFS API: https://docs.microsoft.com/en-us/azure/databricks/dev-tools/api/latest/dbfs
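For example, a minimal sketch of listing the same path through the DBFS API (the workspace URL and token below are placeholders, not values from this thread):
```python
# Sketch: list /databricks-datasets via the DBFS REST API instead of the Workspace API.
# WORKSPACE_URL and TOKEN are placeholders -- substitute your own workspace and PAT.
import requests

WORKSPACE_URL = "https://<your-workspace>.azuredatabricks.net"  # placeholder
TOKEN = "<personal-access-token>"  # placeholder

resp = requests.get(
    f"{WORKSPACE_URL}/api/2.0/dbfs/list",
    headers={"Authorization": f"Bearer {TOKEN}"},
    params={"path": "/databricks-datasets"},
)
resp.raise_for_status()
for f in resp.json().get("files", []):
    print(f["path"], "(dir)" if f["is_dir"] else f["file_size"])
```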
11-16-2021 03:52 AM
Hi Hubert
Thanks, but I guess I didn't explain the issue well: we were looking to find out what was mounted to that location. We've got it now; it goes off to an S3 bucket in AWS.
Cheers
Mat
11-17-2021 10:13 AM
Hi @Mathew Walters ,
The mount point is created dynamically according to the cloud provider you are accessing from. For example, if you are using AWS, the mount point will pull the data from an S3 location.
If you execute display(dbutils.fs.mounts()) in your notebook, you will be able to see each mount point and the source it maps to.
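For instance, a quick sketch of printing every mount point with its backing source, then pulling out the /databricks-datasets entry (dbutils is available in notebooks without an import):
```python
# Print each mount point and the storage location it maps to.
for m in dbutils.fs.mounts():
    print(m.mountPoint, "->", m.source)

# Look specifically at the /databricks-datasets mount.
datasets_mount = [m for m in dbutils.fs.mounts() if m.mountPoint == "/databricks-datasets"]
print(datasets_mount)
```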
11-17-2021 11:44 PM
Hi @Jose Gonzalez
That's not correct, for us at least: we are using Azure and the mount goes off to AWS S3. Running that command also doesn't give us the information; it was the first thing I tried:
As you can see, it doesn't show the real location for /databricks-datasets, even though /mnt/training does (and that one is in Azure).
We tracked it down to S3 using firewall logs in the end.
Cheers
Mat
11-18-2021 01:29 PM
Yes, now I remember that on Azure I also had it pointing to S3, and I deleted the content. It is some Databricks education repo on S3, s3a://databricks-datasets-oregon. You can see it in the logs with more details. No idea how to get rid of it:
21/11/18 21:21:33 INFO DatabricksMountsStore: Updated mounts cache. Changes: List(, (+,DbfsMountPoint(s3a://databricks-datasets-oregon/, /databricks-datasets)), (+,DbfsMountPoint(unsupported-access-mechanism-for-path--use-mlflow-client:/, /databricks/mlflow-tracking)), (+,DbfsMountPoint(wasbs://dbstoragexxx.blob.core.windows.net/***, /databricks-results)), (+,DbfsMountPoint(unsupported-access-mechanism-for-path--use-mlflow-client:/, /databricks/mlflow-registry))
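If you want to pull those mount cache entries out of the driver log yourself, something like the following rough sketch works; the log path is an assumption and may differ by Databricks Runtime version:
```python
# Sketch: grep the active driver log for the DatabricksMountsStore entries quoted above.
# The log path is an assumed default location and may vary by runtime.
import subprocess

LOG_PATH = "/databricks/driver/logs/log4j-active.log"  # assumed location
out = subprocess.run(
    ["grep", "DatabricksMountsStore", LOG_PATH],
    capture_output=True, text=True,
)
print(out.stdout or "No DatabricksMountsStore entries found in this log file.")
```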
11-18-2021 11:50 AM
Hello Mat,
Thanks for letting us know. Would you be happy to mark the answer that solved this as best? That way, members will be able to find the solution more easily. 🙂