03-19-2024 02:52 AM
Hi,
We are trying to read data from MongoDB in a Databricks notebook using PySpark.
When we try to display the DataFrame with show() or display(), we get the error "org.bson.BsonInvalidOperationException: Document does not contain key count".
The data in the Mongo collection is in time-series (struct) format.

```python
connectionString = 'mongodb+srv://CONNECTION_STRING_HERE/'
database = "sample_supplies"
collection = "sales"

salesDF = (
    spark.read.format("mongo")
    .option("database", database)
    .option("collection", collection)
    .option("spark.mongodb.input.uri", connectionString)
    .load()
)
display(salesDF)
```

The call fails with:
"org.bson.BsonInvalidOperationException: Document does not contain key count"
Accepted Solutions
07-19-2024 08:17 AM
Thanks, @Retired_mod, for your input. I had the same problem: I couldn't display the DataFrame with only mongo-spark-connector installed on my cluster (DBR 14.3 LTS, Spark 3.5.0, Scala 2.12). After installing the rest of the suggested JAR files it still failed, but after switching to DBR 13.3 LTS (Spark 3.4.1, Scala 2.12) it worked.
07-19-2024 09:20 AM
UPDATE:
When mongo-spark-connector_2.12-10.3.0-all.jar is installed from Maven, the JAR files below do NOT need to be installed on the cluster to display the DataFrame:
- bson
- mongodb-driver-core
- mongodb-driver-sync
Also, I noticed that both DBR 13.3 LTS and 14.3 LTS work fine with this specific Spark connector JAR installed on the cluster.
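For reference, the 10.x connector also changes the read API slightly compared with the snippet in the question: the data-source short name is "mongodb" (the legacy 3.x connector used "mongo"), and the input URI option becomes spark.mongodb.read.connection.uri. A minimal sketch reusing the placeholders from the question; this assumes the 10.3.0 JAR above is attached to the cluster and a reachable Atlas cluster, so it is not runnable as-is:

```python
# Sketch: reading the same collection with the MongoDB Spark Connector 10.x.
# connectionString keeps the redacted placeholder from the original post.
connectionString = "mongodb+srv://CONNECTION_STRING_HERE/"
database = "sample_supplies"
collection = "sales"

salesDF = (
    spark.read.format("mongodb")  # 10.x short name; 3.x used "mongo"
    .option("spark.mongodb.read.connection.uri", connectionString)
    .option("database", database)
    .option("collection", collection)
    .load()
)
display(salesDF)
```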
03-19-2024 06:08 AM
Hi @Retired_mod, I tried all the steps above and it still didn't work. We are checking with the MongoDB team in parallel.

