<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: When should I use ".start()" with writeStream? in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/when-should-i-use-quot-start-quot-with-writestream/m-p/26407#M18467</link>
    <description>&lt;P&gt;Thanks for your message. I am still looking for the answer. &lt;/P&gt;&lt;P&gt;&lt;/P&gt;</description>
    <pubDate>Sun, 27 Nov 2022 20:39:09 GMT</pubDate>
    <dc:creator>Mado</dc:creator>
    <dc:date>2022-11-27T20:39:09Z</dc:date>
    <item>
      <title>When should I use ".start()" with writeStream?</title>
      <link>https://community.databricks.com/t5/data-engineering/when-should-i-use-quot-start-quot-with-writestream/m-p/26405#M18465</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I am practicing with Databricks. In the sample notebooks, I have seen writeStream used both with and without the ".start()" method. Examples are below:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;B&gt;Without .start()&lt;/B&gt;&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;(spark.readStream
      .format("cloudFiles")
      .option("cloudFiles.format", source_format)
      .option("cloudFiles.schemaLocation", checkpoint_directory)
      .load(data_source)
      .writeStream
      .option("checkpointLocation", checkpoint_directory)
      .option("mergeSchema", "true")
      .table(table_name))&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;B&gt;With .start()&lt;/B&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;(myDF
 .writeStream
 .format("delta")
 .option("checkpointLocation", checkpointPath)
 .outputMode("append")
 .start(path)
)&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;B&gt;With .start()&lt;/B&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;query = (streaming_df.writeStream
                         .foreachBatch(streaming_merge.upsert_to_delta)
                         .outputMode("update")
                         .option("checkpointLocation", f"{DA.paths.checkpoints}/recordings")
                         .trigger(availableNow=True)
                         .start())
query.awaitTermination()
&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;1) I don't understand when I should or shouldn't use the ".start()" method. I would appreciate any guidance on this.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;2) If I don't pass a "path" to "start()", where will the data files be written?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thanks for your help.&lt;/P&gt;</description>
      <pubDate>Thu, 20 Oct 2022 07:44:25 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/when-should-i-use-quot-start-quot-with-writestream/m-p/26405#M18465</guid>
      <dc:creator>Mado</dc:creator>
      <dc:date>2022-10-20T07:44:25Z</dc:date>
    </item>
    <item>
      <title>Re: When should I use ".start()" with writeStream?</title>
      <link>https://community.databricks.com/t5/data-engineering/when-should-i-use-quot-start-quot-with-writestream/m-p/26406#M18466</link>
      <description>&lt;P&gt;Hi @Mohammad Saber,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Great to meet you, and thanks for your question!&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Let's first see whether your peers in the community have an answer to your question. Otherwise, Bricksters will get back to you soon.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;</description>
      <pubDate>Sun, 27 Nov 2022 13:46:57 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/when-should-i-use-quot-start-quot-with-writestream/m-p/26406#M18466</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2022-11-27T13:46:57Z</dc:date>
    </item>
    <item>
      <title>Re: When should I use ".start()" with writeStream?</title>
      <link>https://community.databricks.com/t5/data-engineering/when-should-i-use-quot-start-quot-with-writestream/m-p/26407#M18467</link>
      <description>&lt;P&gt;Thanks for your message. I am still looking for the answer.&lt;/P&gt;</description>
      <pubDate>Sun, 27 Nov 2022 20:39:09 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/when-should-i-use-quot-start-quot-with-writestream/m-p/26407#M18467</guid>
      <dc:creator>Mado</dc:creator>
      <dc:date>2022-11-27T20:39:09Z</dc:date>
    </item>
  </channel>
</rss>

