Greetings @mtaraviya-QA, here's how to configure your interactive Databricks compute to access files in AWS EFS.
### Prerequisites on AWS networking
- Ensure the Databricks cluster's VPC and subnets can reach the EFS mount targets: place mount targets in subnets routable from the cluster, and open NFS (TCP 2049) in the relevant security groups.
- If EFS is in a different VPC or AWS account, set up VPC peering or a transit gateway and configure routing between the VPCs.
- Enable VPC DNS resolution and hostnames, and test reachability from the cluster network (see the snippet after this list). If cross-VPC DNS is not available, plan to mount using the mount target IP address.
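A quick reachability check you can run from any host on the cluster network; the file system ID, region, and host below are placeholders for your own values:

```bash
# Resolve the EFS DNS name
nslookup fs-xxxx.efs.<region>.amazonaws.com

# Confirm the NFS port is reachable end to end
nc -vz <efs-hostname-or-ip> 2049
```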
### Configure the Databricks cluster (API-based mount)
Mount EFS via the Clusters API using the experimental cluster_mount_infos field. Do not use init scripts for EFS on typical shared/E2 workspaces.
- Create or edit your cluster to include cluster_mount_infos (a sample REST call follows this list); for example:

```json
{
  "cluster_name": "efs-cluster",
  "spark_version": "15.4.x-scala2.12",
  "node_type_id": "i3.xlarge",
  "num_workers": 2,
  "cluster_mount_infos": [
    {
      "network_filesystem_info": {
        "server_address": "fs-abcdef0123456789.efs.us-east-1.amazonaws.com",
        "mount_options": "nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport"
      },
      "local_mount_dir_path": "/mnt/volumes/efs-mount",
      "remote_mount_dir_path": "/"
    }
  ]
}
```
- If DNS doesn't resolve (common in cross-VPC setups), use the mount target IP for server_address. Optionally pin the cluster's AZ with aws_attributes.zone_id to match the mount target's AZ.
- Access the mount at the path you specified in local_mount_dir_path (for example, /mnt/volumes/efs-mount). In some environments mounts are presented under /db-mnt/...; if you don't see your path at the root, check under /db-mnt.
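A minimal sketch of submitting that spec with curl; DATABRICKS_HOST, DATABRICKS_TOKEN, and the file name efs-cluster.json are placeholders for your own values, and the same field is accepted by the clusters/edit endpoint for existing clusters:

```bash
# Save the JSON spec above as efs-cluster.json, then create the cluster
curl -X POST "${DATABRICKS_HOST}/api/2.1/clusters/create" \
  -H "Authorization: Bearer ${DATABRICKS_TOKEN}" \
  -H "Content-Type: application/json" \
  -d @efs-cluster.json
```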
### Terraform example
```hcl
resource "databricks_cluster" "with_efs" {
  # ...
  cluster_mount_info {
    network_filesystem_info {
      server_address = "fs-abcdef0123456789.efs.us-east-1.amazonaws.com"
      mount_options  = "nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport"
    }
    remote_mount_dir_path = "/"
    local_mount_dir_path  = "/mnt/volumes/efs-mount"
  }
}
```
### Verify from a notebook
- Check the mount and list the files:

```bash
%sh
mount | grep efs
ls -la /mnt/volumes/efs-mount
```
- Access files via POSIX paths:

```python
with open("/mnt/volumes/efs-mount/somefile.txt") as f:
    print(f.readline())
```
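If you also want to read the mounted files with Spark rather than plain Python, local POSIX paths need the file:/ scheme; a small sketch (the file name is a placeholder):

```python
# Read from the EFS mount with Spark; local paths require the file:/ prefix
df = spark.read.text("file:/mnt/volumes/efs-mount/somefile.txt")
df.show(5, truncate=False)
```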
### Important notes and limitations
- Init scripts are not the supported method for mounting EFS on shared/E2 workspaces; use the Clusters API (cluster_mount_infos) or Terraform.
- The amazon-efs-utils IAM/TLS mount helper is not supported in this integration. Use NFSv4.1 with standard mount options (as shown above).
- After cluster edits or restarts, ensure the mount configuration remains in the cluster definition; avoid editing mounts in the UI, as custom properties can be lost (one way to check is shown after this list). If autoscaling adds nodes, Databricks applies the configured mount during node setup.
- This applies to classic clusters. For serverless compute, use S3 or Unity Catalog Volumes rather than EFS.
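A quick way to confirm the mount configuration persisted, assuming you have the Databricks CLI configured (the cluster ID is a placeholder; the legacy CLI takes --cluster-id instead of a positional argument):

```bash
# Fetch the cluster spec and check that cluster_mount_infos is still present
databricks clusters get 0123-456789-abcdefgh | grep -A 8 cluster_mount_infos
```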
### Troubleshooting checklist
- From a same-subnet test EC2 instance (with the same security group rules), try:

```bash
sudo apt-get update && sudo apt-get install -y nfs-common
sudo mkdir /efs
sudo mount -t nfs4 -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport <efs-host-or-ip>:/ /efs
mount | grep efs
```

This validates network, security group, and DNS routing independently of Databricks.
- If DNS fails, use the mount target IP and pin the cluster AZ (a sketch follows this list). Confirm security groups allow TCP 2049 and that routing over the peering connection or transit gateway is correct.
- If you're using RStudio on Databricks, EFS via cluster_mount_infos is a good way to persist user data; ensure the cluster can write to the mount (for example, chmod a+w /mnt/volumes/efs-mount).
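As a sketch of that IP-plus-AZ fallback, with a placeholder mount target IP and zone standing in for your own values:

```json
{
  "aws_attributes": { "zone_id": "us-east-1a" },
  "cluster_mount_infos": [
    {
      "network_filesystem_info": {
        "server_address": "10.0.12.34",
        "mount_options": "nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport"
      },
      "local_mount_dir_path": "/mnt/volumes/efs-mount",
      "remote_mount_dir_path": "/"
    }
  ]
}
```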
Hope this helps, Louis.