<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Streaming data to CosmosDB in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/streaming-data-to-cosmosdb/m-p/18093#M11960</link>
    <description>&lt;P&gt;Problem solved!&lt;/P&gt;&lt;P&gt;Instead of trying to do everything directly with the .writeStream options, I used the .foreachBatch() function, which lets me hand each micro-batch to a function defined outside the .writeStream chain.&lt;/P&gt;&lt;P&gt;That function receives the current micro-batch of my stream as a DataFrame parameter, and I write it directly to CosmosDB with the save() function and my configuration.&lt;/P&gt;</description>
    <pubDate>Thu, 09 Jun 2022 16:14:30 GMT</pubDate>
    <dc:creator>Antoine_De_A</dc:creator>
    <dc:date>2022-06-09T16:14:30Z</dc:date>
    <item>
      <title>Streaming data to CosmosDB</title>
      <link>https://community.databricks.com/t5/data-engineering/streaming-data-to-cosmosdb/m-p/18092#M11959</link>
      <description>&lt;P&gt;Hello everyone,&lt;/P&gt;&lt;P&gt;Here is the problem I am facing. I'm working on streaming data in Databricks. My goal is to create a data stream in a first notebook, then, in a second notebook, read that stream, append every new row to a DataFrame, and finally write the rows to my CosmosDB instance as they arrive.&lt;/P&gt;&lt;P&gt;The first notebook works without problems: it appends rows to a temporary file in DBFS (&lt;B&gt;annex1&lt;/B&gt;).&lt;/P&gt;&lt;P&gt;In the second notebook, readStream picks up the lines added to that file in real time and adds them to a DataFrame called stream. This part also works (&lt;B&gt;annex2&lt;/B&gt;).&lt;/P&gt;&lt;P&gt;My problem is with the second function of this notebook. I can save the lines that are progressively added to the DataFrame to a temporary file, but I would like to save these new lines to a CosmosDB instance that uses the Core (SQL) API (&lt;B&gt;annex3&lt;/B&gt;). I can't get the writeStream call right. Do you have any idea why it doesn't work? (&lt;B&gt;annex4&lt;/B&gt;)&lt;/P&gt;&lt;P&gt;Note that, from another notebook, I can listen to every action performed on a CosmosDB container, so I'm sure it's possible to insert data (&lt;B&gt;annex5&lt;/B&gt;).&lt;/P&gt;</description>
      <pubDate>Thu, 09 Jun 2022 13:47:26 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/streaming-data-to-cosmosdb/m-p/18092#M11959</guid>
      <dc:creator>Antoine_De_A</dc:creator>
      <dc:date>2022-06-09T13:47:26Z</dc:date>
    </item>
    <item>
      <title>Re: Streaming data to CosmosDB</title>
      <link>https://community.databricks.com/t5/data-engineering/streaming-data-to-cosmosdb/m-p/18093#M11960</link>
      <description>&lt;P&gt;Problem solved!&lt;/P&gt;&lt;P&gt;Instead of trying to do everything directly with the .writeStream options, I used the .foreachBatch() function, which lets me hand each micro-batch to a function defined outside the .writeStream chain.&lt;/P&gt;&lt;P&gt;That function receives the current micro-batch of my stream as a DataFrame parameter, and I write it directly to CosmosDB with the save() function and my configuration.&lt;/P&gt;</description>
      <pubDate>Thu, 09 Jun 2022 16:14:30 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/streaming-data-to-cosmosdb/m-p/18093#M11960</guid>
      <dc:creator>Antoine_De_A</dc:creator>
      <dc:date>2022-06-09T16:14:30Z</dc:date>
    </item>
  </channel>
</rss>

