Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Confused
New Contributor III

Hi Guys

Is there any documentation on where the /databricks-datasets/ mount is actually served from?

We are looking at locking down where our workspace can reach out to via the internet and as it currently stands we are unable to reach this.

I did look at the mounts() command but it didn't give a URL etc and checked the readme in the root and it didn't have any useful info.

Thanks

Mat

6 REPLIES

Hubert-Dudek
Esteemed Contributor III

It is in DBFS default storage:

/dbfs/databricks-datasets

dbfs:/databricks-datasets

In the console it appears under Data, not Workspace. To test, you can use:

dbutils.fs.ls('/databricks-datasets')

(with dbutils there is no need to add the /dbfs prefix, since dbutils only operates on DBFS paths)

When you use the REST API, the Workspace API covers notebooks, libraries, etc., not DBFS. To access DBFS storage you need to use the DBFS API: https://docs.microsoft.com/en-us/azure/databricks/dev-tools/api/latest/dbfs
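As a minimal sketch of that DBFS API route: the /api/2.0/dbfs/list endpoint is from the linked docs, while the host and the surrounding helper are hypothetical (you would send a GET with an Authorization: Bearer token header to the resulting URL):

```python
import urllib.parse

def dbfs_list_url(host: str, path: str) -> str:
    """Build the DBFS 'list' endpoint URL for a workspace host and DBFS path."""
    return f"{host}/api/2.0/dbfs/list?path={urllib.parse.quote(path, safe='/')}"

# Hypothetical workspace host; a GET here (with a bearer token) returns
# the files under /databricks-datasets as JSON.
print(dbfs_list_url("https://adb-1234.azuredatabricks.net", "/databricks-datasets"))
```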

Confused
New Contributor III

Hi Hubert

Thanks, but I guess I didn't explain the issue well. We were looking to find out what was mounted at that location. We've got it now: it goes off to an S3 bucket in AWS.

Cheers

Mat

jose_gonzalez
Moderator

Hi @Mathew Walters​ ,

The mount point is created dynamically according to the cloud provider you are accessing from. For example, if you are using AWS, the mount point will pull the data from an S3 location.

If you execute "display(dbutils.fs.mounts())" in your notebook, you will be able to find the source each mount point maps to.
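Outside a notebook you cannot call dbutils, but the result of dbutils.fs.mounts() is essentially a list of tuples with mountPoint and source fields. A small sketch of filtering such a list (the MountInfo stand-in and helper are illustrative; the sample entry is taken from the DatabricksMountsStore log quoted later in this thread):

```python
from collections import namedtuple

# Minimal stand-in for the entries dbutils.fs.mounts() returns in a notebook
MountInfo = namedtuple("MountInfo", ["mountPoint", "source"])

def find_mount_source(mounts, mount_point):
    """Return the backing source URI for a mount point, or None if absent."""
    for m in mounts:
        if m.mountPoint == mount_point:
            return m.source
    return None

# Sample entry matching the driver-log line quoted in this thread
mounts = [MountInfo("/databricks-datasets", "s3a://databricks-datasets-oregon/")]
print(find_mount_source(mounts, "/databricks-datasets"))  # s3a://databricks-datasets-oregon/
```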

Confused
New Contributor III

Hi @Jose Gonzalez​ 

That's not correct, for us at least: we are using Azure and the mount goes off to AWS S3. It also doesn't give us the information when we run that command; it was the first thing I tried.

As you can see in the attached screenshot, it doesn't give us the real location, even though /mnt/training does (and is in Azure).

We tracked it down to S3 using firewall logs in the end.

Cheers

Mat

Hubert-Dudek
Esteemed Contributor III

Yes, now I remember that on Azure I also had it on S3, and I deleted the content. It is a Databricks education repo on S3: s3a://databricks-datasets-oregon. You can see it in the logs with more details. No idea how to get rid of it:

21/11/18 21:21:33 INFO DatabricksMountsStore: Updated mounts cache. Changes: List(, (+,DbfsMountPoint(s3a://databricks-datasets-oregon/, /databricks-datasets)), (+,DbfsMountPoint(unsupported-access-mechanism-for-path--use-mlflow-client:/, /databricks/mlflow-tracking)), (+,DbfsMountPoint(wasbs://dbstoragexxx.blob.core.windows.net/***, /databricks-results)), (+,DbfsMountPoint(unsupported-access-mechanism-for-path--use-mlflow-client:/, /databricks/mlflow-registry))
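If all you have are driver logs like the line above, the mount sources can be pulled out with a small regex. This is a sketch against the observed log text, not a guaranteed log format; the sample string is shortened from the line above:

```python
import re

# Shortened copy of the DatabricksMountsStore log line above
log = ("Updated mounts cache. Changes: List(, "
       "(+,DbfsMountPoint(s3a://databricks-datasets-oregon/, /databricks-datasets)), "
       "(+,DbfsMountPoint(wasbs://dbstoragexxx.blob.core.windows.net/***, /databricks-results)))")

# Each entry has the shape DbfsMountPoint(<source>, <mount point>)
pairs = re.findall(r"DbfsMountPoint\(([^,]+),\s*([^)]+)\)", log)
for source, mount in pairs:
    print(mount, "->", source)
```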

Anonymous
Not applicable

Hello Mat,

Thanks for letting us know. Would you be happy to mark your answer as best if that will solve the problem for others? That way, members will be able to find the solution more easily. 🙂
