Spark connector to mongodb - mongo-spark-connector_2.12:10.1.1

I've added a library to the cluster, and it appears in the Spark UI as Added By User.


I'm trying to connect using the following SparkSession configuration, but it is not working:

spark = (
    SparkSession.builder
    .config('spark.mongodb.input.uri', connectionString)
    .config('spark.jars.packages', 'org.mongodb.spark:mongo-spark-connector_2.12:10.1.1')
    .getOrCreate()
)

If I uninstall this library and install the previous one, 2.12:3.0.1, the connection works.

Can anyone help me with that?



It looks like you are trying to connect to MongoDB using the mongo-spark-connector_2.12:10.1.1 library, but you are facing issues with the connection. Here are a few things you can try to resolve the issue:

  1. Double-check the connection string: Make sure that the connection string you are using is correct and has the right format. You can verify this by connecting to MongoDB directly using the mongo shell or a MongoDB client.
  2. Check the Spark logs: Look for any error messages in the Spark logs. This might give you some clues about the issue. You can access the logs from the Spark UI, or submit the application with event logging enabled so the run can be inspected afterwards, for example:
$SPARK_HOME/bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master yarn --deploy-mode client \
  --driver-memory 4g --executor-memory 2g --executor-cores 1 --num-executors 2 \
  --conf spark.eventLog.enabled=true \
  --conf spark.eventLog.dir=hdfs:///spark-history \
  --conf spark.history.fs.logDirectory=hdfs:///spark-history \
  --jars /path/to/mongo-spark-connector_2.12-10.1.1.jar \
  /path/to/your/application.jar

  3. Try a different version of the library: If the above two steps don't work, you can try using a different version of the mongo-spark-connector library. You can find the list of available versions here:

  4. Check compatibility with the MongoDB server version: Make sure that the version of the mongo-spark-connector library you are using is compatible with the version of the MongoDB server you are using. You can check the compatibility matrix here:

I hope these suggestions help you resolve the issue.

In version 10.x of the MongoDB Spark Connector, some configuration options have changed.

You now have to pass the renamed connection URI option instead of spark.mongodb.input.uri.

Check out the other new options on the "Read Configuration Options" page of the MongoDB Spark Connector documentation.
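
To make the rename concrete, here is a minimal sketch. The exact option name did not survive in the post above, so the 10.x key names used here (spark.mongodb.read.connection.uri and spark.mongodb.write.connection.uri) are my reading of the connector's configuration docs and should be treated as an assumption. The helper names migrate_conf and build_session and the placeholder connection string are hypothetical.

```python
# Sketch only: the 10.x key names below are taken from my reading of the
# connector's configuration docs, not quoted from this thread.
LEGACY_TO_V10 = {
    "spark.mongodb.input.uri": "spark.mongodb.read.connection.uri",
    "spark.mongodb.output.uri": "spark.mongodb.write.connection.uri",
}


def migrate_conf(conf):
    """Return a copy of conf with 3.x-style URI keys renamed to 10.x-style keys."""
    return {LEGACY_TO_V10.get(key, key): value for key, value in conf.items()}


def build_session(conf):
    """Create (or reuse) a SparkSession with every entry of conf applied."""
    # Imported lazily so the helpers above can be used without pyspark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()


# Placeholder connection string (hypothetical); replace it with your own.
connection_string = "mongodb://localhost:27017"

old_conf = {
    "spark.mongodb.input.uri": connection_string,
    "spark.jars.packages": "org.mongodb.spark:mongo-spark-connector_2.12:10.1.1",
}
new_conf = migrate_conf(old_conf)

# On a machine or cluster with pyspark available:
# spark = build_session(new_conf)
```

Keeping the rename in a small mapping makes it easy to switch back if you have to downgrade to 3.0.1 again.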

I'm facing a similar problem with anything above org.mongodb.spark:mongo-spark-connector_2.12:3.0.1.
So versions 10+ of org.mongodb.spark:mongo-spark-connector_2.12 are not working with Databricks 12.2 LTS 😣

org.apache.spark.SparkClassNotFoundException: [DATA_SOURCE_NOT_FOUND] Failed to find data source: mongo. Please find packages at ``.

Is there anything that needs to be done in addition to installing it as a cluster library? Maybe some additional cluster option?

@DmytroSokhach I think it works if you change the data source name from mongo to mongodb in the read options, and use the renamed connection URI option instead of spark.mongodb.input.uri, as @silvadev suggested.
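
A rough sketch of what that read could look like with the 10.x connector. It assumes the connection URI is already set on the session (for 10.x I believe the key is spark.mongodb.read.connection.uri, but that name is not spelled out in this thread). The helper name read_collection and the database and collection names in the usage line are made up for illustration.

```python
def read_collection(spark, database, collection):
    """Load a MongoDB collection as a DataFrame via the 10.x data source.

    Assumes the connection URI is already configured on the SparkSession.
    """
    return (
        spark.read.format("mongodb")  # 3.x used "mongo", which 10.x no longer registers
        .option("database", database)
        .option("collection", collection)
        .load()
    )


# Usage on a cluster where `spark` already exists (names are hypothetical):
# df = read_collection(spark, "people", "contacts")
# df.printSchema()
```

If the DATA_SOURCE_NOT_FOUND error persists with the mongodb name, the library itself is probably not attached to the cluster.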

