Warehousing & Analytics
Engage in discussions on data warehousing, analytics, and BI solutions within the Databricks Community. Share insights, tips, and best practices for leveraging data for informed decision-making.

Issue with MongoDB Spark Connector in Databricks

vidya_kothavale
New Contributor III

 

I followed the official Databricks documentation (https://docs.databricks.com/en/_extras/notebooks/source/mongodb.html) to integrate MongoDB Atlas with Spark by setting up the MongoDB Spark Connector and configuring the connection string in my Databricks cluster. However, I am encountering issues when trying to read data from MongoDB using Spark.

While I can successfully connect to MongoDB using the MongoClient in Python and execute queries like

from pymongo import MongoClient
client = MongoClient("connectionstring")
db = client["demo"]
collection = db["demo_collection"]
print(collection.find_one())

I am unable to load data using the Spark connector with the following code:

df = spark.read.format("mongodb") \
    .option("database", database) \
    .option("spark.mongodb.input.uri", connectionString) \
    .option("collection", "demo_collection") \
    .load()
df.printSchema()

The connection string is the same in both cases, and I have confirmed that the necessary permissions and IP whitelisting are correctly configured in MongoDB Atlas.

Despite this, no data is being retrieved when using Spark, and I’m unable to identify the issue.

I have also attached a screenshot of the error below.

Can anyone provide guidance on potential configuration issues or additional steps needed to troubleshoot this problem with the MongoDB Spark connector in Databricks?

Accepted Solution

szymon_dybczak
Esteemed Contributor III

Hi @vidya_kothavale ,

Could you try changing "spark.mongodb.input.uri" to the following?

spark.read.format("mongodb").option("spark.mongodb.read.connection.uri", connectionString)
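
For anyone landing here later, the full read with the renamed option might look like the sketch below. It assumes the MongoDB Spark Connector 10.x is installed on the cluster, which is the version line that uses the "mongodb" format and the "spark.mongodb.read.connection.uri" key. The helper names mongo_read_options and read_mongo_collection are hypothetical, introduced only for illustration, and spark is the session object that Databricks notebooks provide.

```python
def mongo_read_options(connection_uri, database, collection):
    """Build the reader options for one collection (hypothetical helper)."""
    return {
        # Key from the accepted answer; replaces "spark.mongodb.input.uri".
        "spark.mongodb.read.connection.uri": connection_uri,
        "database": database,
        "collection": collection,
    }


def read_mongo_collection(spark, connection_uri, database, collection):
    """Load one MongoDB collection as a Spark DataFrame."""
    options = mongo_read_options(connection_uri, database, collection)
    # DataFrameReader.options accepts keyword arguments, so the dict is unpacked.
    return spark.read.format("mongodb").options(**options).load()
```

In a notebook this would be called as df = read_mongo_collection(spark, connectionString, database, "demo_collection"), followed by df.printSchema() as in the original question.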

 


2 REPLIES 2

Thanks, @szymon_dybczak! It's working perfectly now.
