Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Connection timeout when connecting to MongoDB using MongoDB Connector for Spark 10.x

RobsonNLPT
Contributor III

Hi.

I'm testing a Databricks connection to a MongoDB 7 cluster (on Azure) using the library org.mongodb.spark:mongo-spark-connector_2.13:10.4.1.

I can connect using Compass, but I get a timeout error from my ADB notebook:

MongoTimeoutException: Timed out while waiting for a server that matches ReadPreferenceServerSelector{readPreference=primary}. Client view of cluster state is {type=UNKNOWN, servers=[{address=localhost:27017, type=UNKNOWN, state=CONNECTING, exception={com.mongodb.MongoSocketOpenException: Exception opening socket}, caused by {java.net.ConnectException: Connection refused}}]

By the way, I can telnet to the server successfully (%sh telnet.....).

Any ideas?

4 REPLIES

RobsonNLPT
Contributor III

Any help?

Kirki
New Contributor II

Hi. Not a solution, I'm afraid, but I'm having the exact same issue. Did you manage to resolve it at all?

What is throwing me is that I'm configuring the IP for the MongoDB instance, since it's running on an EC2 instance in AWS, but I still see 'localhost' in the error message as if my configuration is being ignored. Is this similar to what you're seeing?

Hi.

 

Yes, same. I see localhost.

My cluster is deployed on Azure Kubernetes. I can connect using pymongo and also Compass.

I've tested using a free Atlas cluster and it worked as well (I changed the Atlas firewall rule to allow my Databricks workspace).

No clues


mark_ott
Databricks Employee

The error you’re seeing — MongoTimeoutException referencing localhost:27017 — suggests your Databricks cluster is trying to connect to MongoDB using the wrong address or that it cannot properly reach the MongoDB cluster endpoint from the notebook, even though telnet works from a shell command.

Immediate Issues and Solutions

  • Wrong Host in Connection String:
    The error log shows localhost:27017, which is almost always incorrect when connecting from Databricks to a cloud MongoDB cluster. The connection string in your Spark configuration or notebook is likely defaulting to localhost, which refers to the Databricks node, not your MongoDB cluster. Compass might connect because it's running from your local machine, where you've specified the correct MongoDB URI. (A quick check for this is sketched just after this list.)

  • Network Connectivity:

    • Telnet confirms the network route, but Spark jobs run on worker nodes, which might have different networking rules. Also, running %sh telnet uses the driver, not the Spark executors, so it is not a definitive test for all nodes in the cluster.

  • Firewall/Security Groups:

    • Even if telnet works from the driver, your MongoDB Atlas or Azure firewall may be blocking traffic from Databricks worker pools. Double-check your IP allowlist or VNet/NSG rules for MongoDB.
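
As a quick check for the first point above, print the URI the session is actually configured with. This is a minimal sketch; the 10.x connector documents mongodb://localhost:27017/ as its default connection.uri, which matches the address in your error.

python
# Prints the effective read URI; "not set" (or a localhost address) means the
# connector will fall back to its documented default, mongodb://localhost:27017/
print(spark.conf.get("spark.mongodb.read.connection.uri", "not set"))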

Troubleshooting Steps

1. Check and Correct the Connection String

Make sure your Spark configuration uses the full MongoDB URI, not localhost. Example Spark config (in a cell):

python
spark.conf.set("spark.mongodb.read.connection.uri", "mongodb+srv://<user>:<password>@<cluster-host>/test?retryWrites=true&w=majority")
spark.conf.set("spark.mongodb.write.connection.uri", "mongodb+srv://<user>:<password>@<cluster-host>/test?retryWrites=true&w=majority")

Replace localhost:27017 with your actual cluster host, username, and password.

2. Test with Python Driver from Notebook

Try connecting with pymongo (if available):

python
from pymongo import MongoClient

client = MongoClient("mongodb+srv://<user>:<password>@<cluster-host>")
print(client.server_info())

If this fails, the problem is at the network/firewall level or improper authentication.
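
A small variation on the same test, assuming pymongo is installed on the cluster: lowering serverSelectionTimeoutMS makes a bad endpoint fail in a few seconds instead of pymongo's 30-second default, which speeds up iterating on URI and firewall changes.

python
from pymongo import MongoClient

# Fail fast: 5-second server selection timeout instead of the 30-second default
client = MongoClient(
    "mongodb+srv://<user>:<password>@<cluster-host>",
    serverSelectionTimeoutMS=5000,
)
print(client.admin.command("ping"))  # raises ServerSelectionTimeoutError if unreachable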

3. Check Node Networking

  • Telnet from %sh only checks connectivity from the driver node.

  • For Spark clusters, networking must allow all worker nodes to reach the database. Workers spun up by Databricks may or may not share the same outbound IP as the driver node. A minimal probe you can run from the executors is sketched below.
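
The following sketch probes connectivity from inside Spark tasks, so it tests the executors rather than the driver. The host and port are placeholders; substitute your MongoDB endpoint.

python
import socket

HOST, PORT = "<cluster-host>", 27017  # placeholders for your MongoDB endpoint

def probe(_):
    # Runs inside a Spark task on a worker node
    try:
        socket.create_connection((HOST, PORT), timeout=5).close()
        yield "ok"
    except OSError as e:
        yield f"failed: {e}"

# Spread tasks across partitions so the probe lands on the worker nodes
print(spark.sparkContext.parallelize(range(8), 8).mapPartitions(probe).collect())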

4. Review Cluster Configuration

  • Ensure you have installed the correct mongo-spark-connector library on your cluster via the Libraries tab.

  • Confirm all Spark jobs use the correct connector version (10.4.1 supports MongoDB 7.x), and that the artifact's Scala suffix (_2.13 here) matches your cluster runtime's Scala version.

  • Double-check any environment variables or secret scopes for credentials; a sketch using a secret scope follows this list.
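
One way to handle the credentials point is a Databricks secret scope, so the URI never appears in plain text in the notebook. The scope and key names here are hypothetical; adjust them to your setup.

python
# Hypothetical scope/key names; create them with the Databricks secrets CLI or API
user = dbutils.secrets.get(scope="mongo", key="username")
pwd = dbutils.secrets.get(scope="mongo", key="password")
uri = f"mongodb+srv://{user}:{pwd}@<cluster-host>/?retryWrites=true&w=majority"
spark.conf.set("spark.mongodb.read.connection.uri", uri)
spark.conf.set("spark.mongodb.write.connection.uri", uri)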

Additional Notes

  • Using mongodb+srv:// is recommended for Atlas or DNS-enabled clusters.

  • If you use private endpoints or VNet integration, ensure Databricks has proper routing/subnet permissions.

  • If you are using auth sources or custom databases, add those parameters to your URI (example below).
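
For example, if your user is defined in the admin database rather than the one you read from, the URI needs an explicit authSource (values are placeholders):

python
uri = "mongodb://<user>:<password>@<cluster-host>:27017/?authSource=admin"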

Example Connection (with Spark DataFrame Read):

python
df = (spark.read.format("mongodb")
      # With connector 10.x, database and collection are separate options
      .option("connection.uri", "mongodb+srv://<user>:<password>@<cluster-host>")
      .option("database", "<database>")
      .option("collection", "<collection>")
      .load())
df.show()
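
For completeness, here is a matching write sketch for the 10.x connector, assuming the write URI was set as in step 1; the mode and names are placeholders to adapt:

python
(df.write.format("mongodb")
    .mode("append")  # or "overwrite"
    .option("database", "<database>")
    .option("collection", "<collection>")
    .save())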

Summary:
Your issue is likely due to an incorrect URI (defaulting to localhost:27017) or network/firewall restrictions unique to the Databricks execution environment, not your laptop. Double-check your connection string in the notebook, test with a standalone Python client, and make sure all nodes have the necessary network permissions to reach MongoDB.