Data Engineering
graceful dbutils mount/unmount

dchokkadi1_5588
New Contributor II

Is there a way to tell dbutils.fs.mount not to throw an error if the path is already mounted?

And vice versa, for unmount not to throw an error if it is already unmounted?

I am running my notebook as a job, and it has an init section that mounts the S3 buckets it needs. Sometimes the mounts were already created by an earlier script.

Since mounting an already-mounted path throws an error, my job exits.

1 ACCEPTED SOLUTION

Accepted Solutions

Bill_Chambers
Contributor II

@Deepak Chokkadi

This is the function that I use:

def mountBucket(dstBucketName: String, dstMountName: String): Unit = {
  val accessKey = "YOUR ACCESS KEY"
  // AWS secret keys can contain "/", which must be URL-encoded for the s3a URI
  val encodedSecretKey = "YOUR SECRET".replace("/", "%2F")
  try {
    dbutils.fs.mount(s"s3a://$accessKey:$encodedSecretKey@$dstBucketName", dstMountName)
    println("All done!")
  } catch {
    // dbutils.fs.mount throws a RemoteException when the mount point already exists
    case e: java.rmi.RemoteException =>
      println("Directory is Already Mounted")
      dbutils.fs.unmount(dstMountName)
      mountBucket(dstBucketName, dstMountName)
    case e: Exception =>
      println("There was some other error: " + e.getMessage)
  }
}

I've put it in a simple, accessible notebook and just run that notebook using %run. Then to mount a bucket I call that function, and it automatically remounts it if needed.
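The same "unmount and retry" pattern can be sketched in Python. This is a hypothetical adaptation, not Databricks-provided code: the function name mount_bucket is my own, and mount_fn/unmount_fn stand in for dbutils.fs.mount/dbutils.fs.unmount so the retry logic can be exercised outside a workspace.

```python
def mount_bucket(source, mount_point, mount_fn, unmount_fn):
    """Mount source at mount_point; if it is already mounted, remount it.

    mount_fn/unmount_fn are injected stand-ins for dbutils.fs.mount and
    dbutils.fs.unmount, which raise an exception mentioning the existing
    mount when the mount point is already in use.
    """
    try:
        mount_fn(source, mount_point)
        return "mounted"
    except Exception as e:
        if "already mounted" in str(e).lower():
            # Remount so the mount always reflects the requested source
            unmount_fn(mount_point)
            mount_fn(source, mount_point)
            return "remounted"
        raise
```

On Databricks you would pass dbutils.fs.mount and dbutils.fs.unmount as the two callables.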

View solution in original post

8 REPLIES


DonatienTessier
New Contributor III

On my side, I test whether the mount point exists before mounting it:

if (!dbutils.fs.mounts.map(mnt => mnt.mountPoint).contains("/mnt/<directory>")) {
  dbutils.fs.mount(
    source = "adl://<datalake_name>.azuredatalakestore.net/<directory>",
    mountPoint = "/mnt/<directory>",
    extraConfigs = configs)
}

Very nice! This is an equivalent if statement in Python:

if any(mount.mountPoint == '/mnt/<directory>' for mount in dbutils.fs.mounts()):

Doesn't it need a not in Python?

if not any(mount.mountPoint == mountPoint for mount in dbutils.fs.mounts()):

__NikolajPurup
New Contributor II

For Python, you could do something like this:

mountName = 'abc'

mounts = [str(i) for i in dbutils.fs.ls('/mnt/')]
if "FileInfo(path='dbfs:/mnt/" + mountName + "/', name='" + mountName + "/', size=0)" in mounts:
    print(mountName + " has already been mounted")
else:
    dbutils.fs.mount(
        source = "wasbs://" + mountName + "@<datalake_name>.blob.core.windows.net/",
        mount_point = "/mnt/" + mountName,
        extra_configs = {"fs.azure.sas." + mountName + ".<datalake_name>.blob.core.windows.net":
                         dbutils.secrets.get(scope = "<secret_scope>", key = "<key_name>")})

viswanathboga
New Contributor II

Is there a way to mount a drive with the Databricks CLI? I want the drive to be present from the time the cluster boots up, so I can redirect logs to mounted blob storage.

DonatienTessier
New Contributor III

Hi,

I guess you should create an init script that will be run when the cluster starts.

I asked the question here:

https://forums.databricks.com/questions/17305/mount-blob-storage-with-init-scripts.html

Mariano_IrvinLo
New Contributor II

If you use Scala to mount a Gen2 data lake, you could try something like this:

// Gather relevant keys
var ServicePrincipalID = ""
var ServicePrincipalKey = ""
var DirectoryID = ""

// Create configurations for our connection
var configs = Map(
  "fs.azure.account.auth.type" -> "OAuth",
  "fs.azure.account.oauth.provider.type" -> "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
  "fs.azure.account.oauth2.client.id" -> ServicePrincipalID,
  "fs.azure.account.oauth2.client.secret" -> ServicePrincipalKey,
  "fs.azure.account.oauth2.client.endpoint" -> DirectoryID)

// Optionally, you can add <directory-name> to the source URI of your mount point.
if (dbutils.fs.mounts.map(mnt => mnt.mountPoint).contains("/mnt/ventas")) {
  "already mounted"
} else {
  dbutils.fs.mount(
    source = "/",
    mountPoint = "/mnt/",
    extraConfigs = configs)
}
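For reference, the same OAuth configuration map can be written as a Python dict. This is a sketch using the ABFS OAuth keys from the Scala snippet above; the empty service-principal values are placeholders you must fill in, and the variable names are my own.

```python
# Service principal credentials (placeholders to fill in)
service_principal_id = ""
service_principal_key = ""
directory_id = ""

# Same ABFS OAuth configuration keys as the Scala Map above
configs = {
    "fs.azure.account.auth.type": "OAuth",
    "fs.azure.account.oauth.provider.type":
        "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
    "fs.azure.account.oauth2.client.id": service_principal_id,
    "fs.azure.account.oauth2.client.secret": service_principal_key,
    "fs.azure.account.oauth2.client.endpoint": directory_id,
}
```

This dict would then be passed as extra_configs to dbutils.fs.mount, guarded by the same mount-point check as in the Scala version.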
