12-09-2025 07:20 PM
I am on Databricks Runtime LTS 14.3 (Spark 3.5.0, Scala 2.12) with mongodb-spark-connector_2.12:10.2.0.
Trying to connect to Amazon DocumentDB using the connector, and all I get is a connection timeout. I tried PyMongo, which works as expected: I can read from the db. I have a CA file that I'm passing as an argument; it's stored in Unity Catalog.
Code:
CONNECTION_URI = f"mongodb://{USERNAME}:{PASSWORD}@{ENDPOINT}:27017/{DATABASE_NAME}?replicaSet=rs0&readPreference=secondaryPreferred"
df = spark.read.format("mongodb") \
    .option("spark.mongodb.read.connection.uri", CONNECTION_URI) \
    .option("collection", COLLECTION) \
    .option("ssl", "true") \
    .option("ssl.CAFile", DBFS_CA_FILE) \
    .option("ssl.enabledProtocols", "TLSv1.2") \
    .load()

Connector error:
SparkConnectGrpcException: (com.mongodb.MongoTimeoutException) Timed out after 30000 ms while waiting for a server that matches com.mongodb.client.internal.MongoClientDelegate. Client view of state is {type=REPLICA_SET, servers=[{address=<endpoint>, type=UNKNOWN, state=CONNECTING, exception={com.mongodb.MongoSocketReadTimeoutException: Timeout while receiving message}, caused by {java.net.SocketTimeoutException: Read timed out}}]
2 weeks ago
If PyMongo works but the Spark connector times out, the issue is almost always JVM TLS configuration or executor-level network access, not credentials or the database itself.
TLS handling (most common cause):
The MongoDB Spark connector runs on the JVM and does not handle CA PEM files the same way as PyMongo. Use a JVM truststore (JKS or PKCS12) instead of ssl.CAFile, and configure it via JVM options for both driver and executors.
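As a sketch of what that configuration could look like (assuming a truststore has already been built, e.g. with keytool; the path and password below are hypothetical placeholders, not values from this thread):

```python
# Hedged sketch: point every JVM (driver and executors) at a custom
# truststore via javax.net.ssl system properties.
truststore_path = "/dbfs/certs/documentdb-truststore.jks"  # hypothetical path
truststore_password = "changeit"                           # hypothetical password

jvm_opts = (
    f"-Djavax.net.ssl.trustStore={truststore_path} "
    f"-Djavax.net.ssl.trustStorePassword={truststore_password}"
)

# These keys go into the cluster's Spark config so both the driver JVM
# and the executor JVMs load the same truststore:
spark_conf = {
    "spark.driver.extraJavaOptions": jvm_opts,
    "spark.executor.extraJavaOptions": jvm_opts,
}
```

Setting only the driver option is a common trap: reads run on executors, so both keys are needed.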
Executor connectivity:
PyMongo usually tests connectivity from the driver only. Spark reads from executors, so confirm that all worker nodes can reach the DocumentDB endpoint on port 27017 (security groups, routes, DNS).
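One quick way to verify this is a plain TCP probe executed on the workers rather than the driver. A minimal sketch (the endpoint in the commented-out Spark part is a placeholder):

```python
import socket

def can_reach(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a plain TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# On a cluster, fan the probe out to the executors so you test worker
# networking, not driver networking ("docdb.example.com" is a placeholder):
# results = (spark.sparkContext
#            .parallelize(range(16))
#            .map(lambda _: can_reach("docdb.example.com", 27017))
#            .collect())
# all(results) should be True if every executor can reach port 27017.
```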
Enable TLS via URI:
Set TLS explicitly in the connection string (e.g. tls=true) rather than relying on connector options.
DocumentDB compatibility:
Add retryWrites=false to the connection string to align with Amazon DocumentDB limitations.
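Putting the last two points together, a sketch of the adjusted connection string (the credentials and endpoint below are placeholders standing in for the poster's real values):

```python
# Placeholders for the real credentials and endpoint:
USERNAME, PASSWORD = "user", "secret"
ENDPOINT, DATABASE_NAME = "docdb.example.com", "mydb"

CONNECTION_URI = (
    f"mongodb://{USERNAME}:{PASSWORD}@{ENDPOINT}:27017/{DATABASE_NAME}"
    "?replicaSet=rs0"
    "&readPreference=secondaryPreferred"
    "&tls=true"           # enable TLS in the URI itself
    "&retryWrites=false"  # DocumentDB does not support retryable writes
)
```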
a week ago
Hello @bianca_unifeye,
I was able to solve this issue by adding a JVM truststore. It involved appending the custom cert to the default Java cacerts truststore. I followed the Databricks KB article "How to import a custom CA certificate". Thanks for the response!
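For reference, a sketch of the kind of keytool import that KB article describes. Everything here is an assumption (the cacerts path, alias, and PEM path are hypothetical; "changeit" is the stock cacerts password), and on Databricks this would typically run from a cluster init script so every node gets the updated store:

```python
import subprocess

cacerts_path = "/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/security/cacerts"  # hypothetical JVM path
pem_path = "/dbfs/certs/documentdb-ca.pem"                                   # hypothetical PEM path

# Compose the keytool command that appends the custom CA to the default store:
cmd = [
    "keytool", "-importcert", "-noprompt",
    "-keystore", cacerts_path,
    "-storepass", "changeit",    # default cacerts password
    "-alias", "documentdb-ca",   # hypothetical alias
    "-file", pem_path,
]
# subprocess.run(cmd, check=True)  # left commented: run it via an init script
```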
a week ago
Glad I was able to help!
2 weeks ago
Hi @rijin-thomas - can you please also allow the CIDR block of the Databricks account VPC in the AWS DocumentDB security group (the executor connectivity point stated by @bianca_unifeye)?