<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How to write ObjectId value using Spark connector 10.2.2 in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/how-to-write-objectid-value-using-spark-connector-10-2-2/m-p/138571#M50968</link>
    <description>&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;To write an ObjectId value using Spark Mongo Connector 10.2.2 in PySpark while updating records, you must convert the ObjectId string into a special format. The Spark Mongo Connector does not automatically recognize a string as an ObjectId; it will otherwise store it as a string in MongoDB rather than the expected BSON ObjectId type.&lt;/P&gt;
&lt;H2 id="required-technique" class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0 md:text-lg [hr+&amp;amp;]:mt-4"&gt;Required Technique&lt;/H2&gt;
&lt;UL class="marker:text-quiet list-disc"&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;Format the ObjectId value using a JSON structure:&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;{"$oid": "&amp;lt;hex string here&amp;gt;"}&lt;/CODE&gt;&lt;/P&gt;
&lt;/LI&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;When creating or updating your DataFrame, convert the&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;_id&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;field (or any ObjectId field) into this format.&lt;/P&gt;
&lt;/LI&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;Set the Spark option:&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;.config("spark.mongodb.write.convertJson", "object_Or_Array_Only")&lt;/CODE&gt;&lt;/P&gt;
&lt;UL class="marker:text-quiet list-disc"&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;This enables the connector to convert the JSON structure to a BSON ObjectId when writing to MongoDB.&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;
&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2 id="pyspark-example" class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0 md:text-lg [hr+&amp;amp;]:mt-4"&gt;PySpark Example&lt;/H2&gt;
&lt;DIV class="w-full md:max-w-[90vw]"&gt;
&lt;DIV class="codeWrapper text-light selection:text-super selection:bg-super/10 my-md relative flex flex-col rounded font-mono text-sm font-normal bg-subtler"&gt;
&lt;DIV class="translate-y-xs -translate-x-xs bottom-xl mb-xl flex h-0 items-start justify-end md:sticky md:top-[100px]"&gt;
&lt;DIV class="overflow-hidden rounded-full border-subtlest ring-subtlest divide-subtlest bg-base"&gt;
&lt;DIV class="border-subtlest ring-subtlest divide-subtlest bg-subtler"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;DIV class="-mt-xl"&gt;
&lt;DIV&gt;
&lt;DIV class="text-quiet bg-subtle py-xs px-sm inline-block rounded-br rounded-tl-[3px] font-thin" data-testid="code-language-indicator"&gt;python&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;DIV&gt;&lt;SPAN&gt;&lt;CODE&gt;&lt;SPAN class="token token"&gt;from&lt;/SPAN&gt; pyspark&lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;sql&lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;functions &lt;SPAN class="token token"&gt;import&lt;/SPAN&gt; col&lt;SPAN class="token token punctuation"&gt;,&lt;/SPAN&gt; struct&lt;SPAN class="token token punctuation"&gt;,&lt;/SPAN&gt; lit

&lt;SPAN class="token token"&gt;# Example with existing DataFrame `df` with '_id' field as string&lt;/SPAN&gt;
df &lt;SPAN class="token token operator"&gt;=&lt;/SPAN&gt; df&lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;withColumn&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token token"&gt;"_id"&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;,&lt;/SPAN&gt; struct&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;lit&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token token"&gt;"$oid"&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;alias&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token token"&gt;"oid"&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;,&lt;/SPAN&gt; col&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token token"&gt;"_id"&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;

&lt;SPAN class="token token"&gt;# Writing with configuration for ObjectId conversion&lt;/SPAN&gt;
df&lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;write \
  &lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;&lt;SPAN class="token token"&gt;format&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token token"&gt;"mongodb"&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt; \
  &lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;option&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token token"&gt;"uri"&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;,&lt;/SPAN&gt; &lt;SPAN class="token token"&gt;"mongodb://host:port/database.collection"&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt; \
  &lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;option&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token token"&gt;"spark.mongodb.write.convertJson"&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;,&lt;/SPAN&gt; &lt;SPAN class="token token"&gt;"object_Or_Array_Only"&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt; \
  &lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;mode&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token token"&gt;"append"&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt; \
  &lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;save&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;
&lt;/CODE&gt;&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;UL class="marker:text-quiet list-disc"&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;Ensure that each ObjectId column is structured as&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;{"$oid": "hexid"}&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;before writing.​&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2 id="mongodb-connector-version-spark-and-scala" class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0 md:text-lg [hr+&amp;amp;]:mt-4"&gt;MongoDB Connector Version, Spark, and Scala&lt;/H2&gt;
&lt;UL class="marker:text-quiet list-disc"&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;Compatible with Spark 3.2.4, Scala 2.13, and mongo-spark-connector_2.12-10.2.2.​&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2 id="key-details" class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0 md:text-lg [hr+&amp;amp;]:mt-4"&gt;Key Details&lt;/H2&gt;
&lt;UL class="marker:text-quiet list-disc"&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;Direct string ObjectIds will be written as strings; you must use the JSON struct format above for true ObjectId behavior in MongoDB.&lt;/P&gt;
&lt;/LI&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;This method is necessary for both insertion and update operations.​&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;This approach should allow you to maintain the ObjectId type as required by MongoDB in your updates and inserts usingg PySpark.&lt;/P&gt;</description>
    <pubDate>Tue, 11 Nov 2025 10:54:49 GMT</pubDate>
    <dc:creator>mark_ott</dc:creator>
    <dc:date>2025-11-11T10:54:49Z</dc:date>
    <item>
      <title>How to write ObjectId value using Spark connector 10.2.2</title>
      <link>https://community.databricks.com/t5/data-engineering/how-to-write-objectid-value-using-spark-connector-10-2-2/m-p/125041#M47329</link>
      <description>&lt;P&gt;In pySpark mongo connector while updating records how to handle _id as objectId.&lt;BR /&gt;&lt;BR /&gt;spark 3.2.4&lt;BR /&gt;scala2.13&lt;BR /&gt;sparkMongoConnector 2.12-10.2.2&lt;/P&gt;</description>
      <pubDate>Sun, 13 Jul 2025 13:46:10 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-to-write-objectid-value-using-spark-connector-10-2-2/m-p/125041#M47329</guid>
      <dc:creator>ask005</dc:creator>
      <dc:date>2025-07-13T13:46:10Z</dc:date>
    </item>
    <item>
      <title>Re: How to write ObjectId value using Spark connector 10.2.2</title>
      <link>https://community.databricks.com/t5/data-engineering/how-to-write-objectid-value-using-spark-connector-10-2-2/m-p/138571#M50968</link>
      <description>&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;To write an ObjectId value using Spark Mongo Connector 10.2.2 in PySpark while updating records, you must convert the ObjectId string into a special format. The Spark Mongo Connector does not automatically recognize a string as an ObjectId; it will otherwise store it as a string in MongoDB rather than the expected BSON ObjectId type.&lt;/P&gt;
&lt;H2 id="required-technique" class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0 md:text-lg [hr+&amp;amp;]:mt-4"&gt;Required Technique&lt;/H2&gt;
&lt;UL class="marker:text-quiet list-disc"&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;Format the ObjectId value using a JSON structure:&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;{"$oid": "&amp;lt;hex string here&amp;gt;"}&lt;/CODE&gt;&lt;/P&gt;
&lt;/LI&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;When creating or updating your DataFrame, convert the&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;_id&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;field (or any ObjectId field) into this format.&lt;/P&gt;
&lt;/LI&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;Set the Spark option:&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;.config("spark.mongodb.write.convertJson", "object_Or_Array_Only")&lt;/CODE&gt;&lt;/P&gt;
&lt;UL class="marker:text-quiet list-disc"&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;This enables the connector to convert the JSON structure to a BSON ObjectId when writing to MongoDB.&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;
&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2 id="pyspark-example" class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0 md:text-lg [hr+&amp;amp;]:mt-4"&gt;PySpark Example&lt;/H2&gt;
&lt;DIV class="w-full md:max-w-[90vw]"&gt;
&lt;DIV class="codeWrapper text-light selection:text-super selection:bg-super/10 my-md relative flex flex-col rounded font-mono text-sm font-normal bg-subtler"&gt;
&lt;DIV class="translate-y-xs -translate-x-xs bottom-xl mb-xl flex h-0 items-start justify-end md:sticky md:top-[100px]"&gt;
&lt;DIV class="overflow-hidden rounded-full border-subtlest ring-subtlest divide-subtlest bg-base"&gt;
&lt;DIV class="border-subtlest ring-subtlest divide-subtlest bg-subtler"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;DIV class="-mt-xl"&gt;
&lt;DIV&gt;
&lt;DIV class="text-quiet bg-subtle py-xs px-sm inline-block rounded-br rounded-tl-[3px] font-thin" data-testid="code-language-indicator"&gt;python&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;DIV&gt;&lt;SPAN&gt;&lt;CODE&gt;&lt;SPAN class="token token"&gt;from&lt;/SPAN&gt; pyspark&lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;sql&lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;functions &lt;SPAN class="token token"&gt;import&lt;/SPAN&gt; col&lt;SPAN class="token token punctuation"&gt;,&lt;/SPAN&gt; struct&lt;SPAN class="token token punctuation"&gt;,&lt;/SPAN&gt; lit

&lt;SPAN class="token token"&gt;# Example with existing DataFrame `df` with '_id' field as string&lt;/SPAN&gt;
df &lt;SPAN class="token token operator"&gt;=&lt;/SPAN&gt; df&lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;withColumn&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token token"&gt;"_id"&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;,&lt;/SPAN&gt; struct&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;lit&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token token"&gt;"$oid"&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;alias&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token token"&gt;"oid"&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;,&lt;/SPAN&gt; col&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token token"&gt;"_id"&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;

&lt;SPAN class="token token"&gt;# Writing with configuration for ObjectId conversion&lt;/SPAN&gt;
df&lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;write \
  &lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;&lt;SPAN class="token token"&gt;format&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token token"&gt;"mongodb"&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt; \
  &lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;option&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token token"&gt;"uri"&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;,&lt;/SPAN&gt; &lt;SPAN class="token token"&gt;"mongodb://host:port/database.collection"&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt; \
  &lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;option&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token token"&gt;"spark.mongodb.write.convertJson"&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;,&lt;/SPAN&gt; &lt;SPAN class="token token"&gt;"object_Or_Array_Only"&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt; \
  &lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;mode&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token token"&gt;"append"&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt; \
  &lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;save&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;
&lt;/CODE&gt;&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;UL class="marker:text-quiet list-disc"&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;Ensure that each ObjectId column is structured as&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;{"$oid": "hexid"}&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;before writing.​&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2 id="mongodb-connector-version-spark-and-scala" class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0 md:text-lg [hr+&amp;amp;]:mt-4"&gt;MongoDB Connector Version, Spark, and Scala&lt;/H2&gt;
&lt;UL class="marker:text-quiet list-disc"&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;Compatible with Spark 3.2.4, Scala 2.13, and mongo-spark-connector_2.12-10.2.2.​&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2 id="key-details" class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0 md:text-lg [hr+&amp;amp;]:mt-4"&gt;Key Details&lt;/H2&gt;
&lt;UL class="marker:text-quiet list-disc"&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;Direct string ObjectIds will be written as strings; you must use the JSON struct format above for true ObjectId behavior in MongoDB.&lt;/P&gt;
&lt;/LI&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;This method is necessary for both insertion and update operations.​&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;This approach should allow you to maintain the ObjectId type as required by MongoDB in your updates and inserts usingg PySpark.&lt;/P&gt;</description>
      <pubDate>Tue, 11 Nov 2025 10:54:49 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-to-write-objectid-value-using-spark-connector-10-2-2/m-p/138571#M50968</guid>
      <dc:creator>mark_ott</dc:creator>
      <dc:date>2025-11-11T10:54:49Z</dc:date>
    </item>
  </channel>
</rss>

