<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Issue with Reading MongoDB Data in Unity Catalog Cluster in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/issue-with-reading-mongodb-data-in-unity-catalog-cluster/m-p/55941#M30473</link>
    <description>&lt;P&gt;A few points:&lt;/P&gt;&lt;P&gt;1. Check that the driver version installed on the cluster exactly matches the one referenced in your code (2.12:3.2.0); it has to match 100 percent:&lt;/P&gt;&lt;P&gt;org.mongodb.spark:mongo-spark-connector_2.12:3.2.0&lt;/P&gt;&lt;P&gt;2. I have seen people configure the connection to Atlas in two ways.&lt;/P&gt;&lt;P&gt;Option 1: Back in Databricks, in your cluster configuration under Advanced Options (at the bottom of the page), paste the connection string into both the spark.mongodb.output.uri and spark.mongodb.input.uri variables, and populate the username and password fields appropriately. This way, every notebook you run on the cluster will use this configuration.&lt;/P&gt;&lt;P&gt;Option 2: Alternatively, you can set the option explicitly when calling the API, e.g. spark.read.format("mongo").option("spark.mongodb.input.uri", connectionString).load(). If you configured the variables on the cluster, you don't have to set the option.&lt;/P&gt;&lt;P&gt;A sketch of Option 2, reusing the database and collection names from your post (the URI still comes from your secret scope):&lt;/P&gt;&lt;LI-CODE lang="python"&gt;# Pass the connection URI explicitly on the read (sketch)
df = (
    spark.read.format("mongo")
    .option("spark.mongodb.input.uri", connectionString)
    .option("database", "cloud")
    .option("collection", "data")
    .load()
)&lt;/LI-CODE&gt;&lt;P&gt;3. Try a single-user cluster (if you are currently using a shared cluster).&lt;/P&gt;&lt;P&gt;4. Check the MongoDB connector's compatibility with the Spark version you are using.&lt;/P&gt;&lt;P&gt;Hope that helps.&lt;/P&gt;</description>
    <pubDate>Fri, 29 Dec 2023 23:29:52 GMT</pubDate>
    <dc:creator>Wojciech_BUK</dc:creator>
    <dc:date>2023-12-29T23:29:52Z</dc:date>
    <item>
      <title>Issue with Reading MongoDB Data in Unity Catalog Cluster</title>
      <link>https://community.databricks.com/t5/data-engineering/issue-with-reading-mongodb-data-in-unity-catalog-cluster/m-p/55929#M30465</link>
      <description>&lt;P&gt;I am encountering an issue while trying to read data from MongoDB in a Unity Catalog Cluster using PySpark. I have shared my code below:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="python"&gt;from pyspark.sql import SparkSession

database = "cloud"
collection = "data"
Scope = "XXXXXXXX"
Key = "XXXXXX-YYYYYY-ZZZZZZ"
connectionString = dbutils.secrets.get(scope=Scope, key=Key)

spark = (
    SparkSession.builder
    .config("spark.mongodb.input.uri", connectionString)
    .config("spark.mongodb.output.uri", connectionString)
    .config("spark.jars.packages", "org.mongodb.spark:mongo-spark-connector_2.12:3.2.0")
    .getOrCreate()
)

# Reading from MongoDB
df = (
    spark.read.format("mongo")
    .option("uri", connectionString)
    .option("database", database)
    .option("collection", collection)
    .load()
)&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;However, I am encountering the following error:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;org.apache.spark.SparkClassNotFoundException: [DATA_SOURCE_NOT_FOUND] Failed to find data source: mongo. Please find packages at `https://spark.apache.org/third-party-projects.html`.&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have already included the necessary MongoDB Spark Connector package, but it seems like Spark is unable to find the data source. Can someone please help me understand what might be causing this issue and how I can resolve it? Any insights or suggestions would be greatly appreciated. Thank you!&lt;/P&gt;</description>
      <pubDate>Fri, 29 Dec 2023 15:55:30 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/issue-with-reading-mongodb-data-in-unity-catalog-cluster/m-p/55929#M30465</guid>
      <dc:creator>naveenprasanth</dc:creator>
      <dc:date>2023-12-29T15:55:30Z</dc:date>
    </item>
    <item>
      <title>Re: Issue with Reading MongoDB Data in Unity Catalog Cluster</title>
      <link>https://community.databricks.com/t5/data-engineering/issue-with-reading-mongodb-data-in-unity-catalog-cluster/m-p/55941#M30473</link>
      <description>&lt;P&gt;A few points:&lt;/P&gt;&lt;P&gt;1. Check that the driver version installed on the cluster exactly matches the one referenced in your code (2.12:3.2.0); it has to match 100 percent:&lt;/P&gt;&lt;P&gt;org.mongodb.spark:mongo-spark-connector_2.12:3.2.0&lt;/P&gt;&lt;P&gt;2. I have seen people configure the connection to Atlas in two ways.&lt;/P&gt;&lt;P&gt;Option 1: Back in Databricks, in your cluster configuration under Advanced Options (at the bottom of the page), paste the connection string into both the spark.mongodb.output.uri and spark.mongodb.input.uri variables, and populate the username and password fields appropriately. This way, every notebook you run on the cluster will use this configuration.&lt;/P&gt;&lt;P&gt;Option 2: Alternatively, you can set the option explicitly when calling the API, e.g. spark.read.format("mongo").option("spark.mongodb.input.uri", connectionString).load(). If you configured the variables on the cluster, you don't have to set the option.&lt;/P&gt;&lt;P&gt;A sketch of Option 2, reusing the database and collection names from your post (the URI still comes from your secret scope):&lt;/P&gt;&lt;LI-CODE lang="python"&gt;# Pass the connection URI explicitly on the read (sketch)
df = (
    spark.read.format("mongo")
    .option("spark.mongodb.input.uri", connectionString)
    .option("database", "cloud")
    .option("collection", "data")
    .load()
)&lt;/LI-CODE&gt;&lt;P&gt;3. Try a single-user cluster (if you are currently using a shared cluster).&lt;/P&gt;&lt;P&gt;4. Check the MongoDB connector's compatibility with the Spark version you are using.&lt;/P&gt;&lt;P&gt;Hope that helps.&lt;/P&gt;</description>
      <pubDate>Fri, 29 Dec 2023 23:29:52 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/issue-with-reading-mongodb-data-in-unity-catalog-cluster/m-p/55941#M30473</guid>
      <dc:creator>Wojciech_BUK</dc:creator>
      <dc:date>2023-12-29T23:29:52Z</dc:date>
    </item>
  </channel>
</rss>

