The error you're seeing (MongoTimeoutException referencing localhost:27017) suggests your Databricks cluster is trying to connect to MongoDB using the wrong address, or that it cannot properly reach the MongoDB cluster endpoint from the notebook, even though telnet works from a shell command.
Immediate Issues and Solutions
- Wrong Host in Connection String: The error log shows localhost:27017, which is almost always incorrect when connecting from Databricks to a cloud MongoDB cluster. The connection string in your Spark configuration or notebook is likely defaulting to localhost, which refers to the Databricks node, not your MongoDB cluster. Compass might connect because it's running from your local machine, where you've specified the correct MongoDB URI. A quick way to verify what the notebook actually uses is shown in the sketch after this list.
- Network Connectivity: Telnet confirms the network route, but Spark jobs run on worker nodes, which may have different networking rules. Also, running %sh telnet uses the driver, not the Spark executors, so it is not a definitive test for all nodes in the cluster.
- Firewall/Security Groups: Even if telnet works from the driver, your MongoDB Atlas or Azure firewall may be blocking traffic from the Databricks worker pool. Double-check your IP allowlist and VNet/NSG rules for MongoDB.
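As a first sanity check, print the URI the notebook will actually hand to the connector. This is a minimal sketch assuming the configuration keys used by the MongoDB Spark Connector 10.x; if either value is unset or contains localhost:27017, the misconfiguration described above is confirmed.

```python
# Inspect the URIs Spark will pass to the MongoDB connector (10.x config keys).
# spark.conf.get returns the fallback string when the key has not been set.
read_uri = spark.conf.get("spark.mongodb.read.connection.uri", "not set")
write_uri = spark.conf.get("spark.mongodb.write.connection.uri", "not set")
print(read_uri)
print(write_uri)
```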
Troubleshooting Steps
1. Check and Correct the Connection String
Make sure your Spark configuration uses the full MongoDB URI, not localhost. Example Spark config (in a cell):

```python
spark.conf.set("spark.mongodb.read.connection.uri", "mongodb+srv://<user>:<password>@<cluster-host>/test?retryWrites=true&w=majority")
spark.conf.set("spark.mongodb.write.connection.uri", "mongodb+srv://<user>:<password>@<cluster-host>/test?retryWrites=true&w=majority")
```

Replace <user>, <password>, and <cluster-host> with your actual username, password, and cluster host.
2. Test with Python Driver from Notebook
Try connecting with pymongo (if it is available on the cluster):

```python
from pymongo import MongoClient

# Fail fast rather than waiting out the default 30-second server selection timeout.
client = MongoClient("mongodb+srv://<user>:<password>@<cluster-host>",
                     serverSelectionTimeoutMS=5000)
print(client.server_info())
```

If this fails, the problem is at the network/firewall level or with authentication, not with Spark itself.
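To tell those two failure modes apart, here is a small sketch that reuses the client from the cell above; both exception types are part of pymongo:

```python
from pymongo.errors import OperationFailure, ServerSelectionTimeoutError

try:
    client.server_info()
except ServerSelectionTimeoutError as e:
    # No server was reachable at all: DNS, routing, or firewall problem.
    print(f"network-level failure: {e}")
except OperationFailure as e:
    # A server answered but rejected the request: credentials or auth-source problem.
    print(f"authentication failure: {e}")
```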
3. Check Node Networking
- Telnet from %sh only checks connectivity from the driver node.
- For Spark clusters, networking must allow all worker nodes to reach the database. Workers spun up by Databricks may or may not share the same outbound IP as the driver node. You can probe connectivity from the executors themselves, as in the sketch below.
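The following is a hedged sketch for probing TCP connectivity from the executors rather than the driver. <cluster-host> and port 27017 are placeholders; with mongodb+srv URIs the replica-set members are resolved via DNS, so you may want to probe the resolved host names instead.

```python
import socket

def probe(_):
    # Attempt a raw TCP connection from whichever executor runs this task.
    try:
        with socket.create_connection(("<cluster-host>", 27017), timeout=5):
            return "ok"
    except OSError as e:
        return f"failed: {e}"

# Spread several probes across the cluster's executors and collect the results.
print(sc.parallelize(range(8), 8).map(probe).collect())
```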
4. Review Cluster Configuration
- Ensure you have installed the correct mongo-spark-connector library on your cluster via the Libraries tab.
- Confirm all Spark jobs use a connector version that supports your server (for example, connector 10.4.1 supports MongoDB 7.x).
- Double-check any environment variables or secret scopes used for credentials; a sketch of reading them from a secret scope follows this list.
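Here is a minimal sketch of pulling credentials from a Databricks secret scope instead of hard-coding them. The scope name "mongo" and the key names are hypothetical; they must match a scope you have actually created:

```python
# Hypothetical scope/key names; create them via the Databricks CLI or API first.
user = dbutils.secrets.get(scope="mongo", key="mongo-user")
password = dbutils.secrets.get(scope="mongo", key="mongo-password")

uri = f"mongodb+srv://{user}:{password}@<cluster-host>/?retryWrites=true&w=majority"
spark.conf.set("spark.mongodb.read.connection.uri", uri)
spark.conf.set("spark.mongodb.write.connection.uri", uri)
```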
Additional Notes
- Using mongodb+srv:// is recommended for Atlas or other DNS-seedlist-enabled clusters.
- If you use private endpoints or VNet integration, ensure Databricks has proper routing/subnet permissions.
- If you use a separate auth source or a custom database, add those parameters to your URI (see the example below).
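For instance (a hedged example; authSource=admin is only correct if your users are defined in the admin database):

```python
# URI with an explicit auth source appended as a query parameter.
uri = "mongodb+srv://<user>:<password>@<cluster-host>/?authSource=admin&retryWrites=true&w=majority"
```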
Example Connection (with Spark DataFrame Read):
With connector 10.x, the database and collection are passed as separate options rather than embedded in the URI:

```python
df = (spark.read.format("mongodb")
      .option("connection.uri", "mongodb+srv://<user>:<password>@<cluster-host>")
      .option("database", "<database>").option("collection", "<collection>").load())
df.show()
```
Summary:
Your issue is likely due to an incorrect URI (defaulting to localhost:27017) or network/firewall restrictions unique to the Databricks execution environment, not your laptop. Double-check your connection string in the notebook, test with a standalone Python client, and make sure all nodes have the necessary network permissions to reach MongoDB.