<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Hello, I'm trying to use Databricks on Azure with a Spark Structured Streaming job and am having a very mysterious issue. I boiled the job down to i... in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/hello-i-39-m-trying-to-use-databricks-on-azure-with-a-spark/m-p/23648#M16367</link>
    <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I'm trying to use Databricks on Azure with a Spark Structured Streaming job and am having a very mysterious issue.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I boiled the job down to its basics for testing: reading from a Kafka topic and writing to the console in a foreachBatch.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Locally, everything works fine indefinitely.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;On Databricks, the task terminates after just over 5 minutes with a "Cancelled" status.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;There are no errors in the log, just the following, which appears to be a graceful shutdown request of some kind, but I don't know where it's coming from:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;22/11/04 18:31:30 INFO DriverCorral$: Cleaning the wrapper ReplId-1ea30-8e4c0-48422-a (currently in status Running(ReplId-1ea30-8e4c0-48422-a,ExecutionId(job-774316032912321-run-84401-action-5645198327600153),RunnableCommandId(9102993760433650959)))
22/11/04 18:31:30 INFO DAGScheduler: Asked to cancel job group 2207618020913201706_9102993760433650959_job-774316032912321-run-84401-action-5645198327600153
22/11/04 18:31:30 INFO ScalaDriverLocal: cancelled jobGroup:2207618020913201706_9102993760433650959_job-774316032912321-run-84401-action-5645198327600153 
22/11/04 18:31:30 INFO ScalaDriverWrapper: Stopping streams for commandId pattern: CommandIdPattern(2207618020913201706,None,Some(job-774316032912321-run-84401-action-5645198327600153)).
22/11/04 18:31:30 INFO DatabricksStreamingQueryListener: Stopping the stream [id=d41eff2a-4de6-4f17-8d1c-659d1c1b8d98, runId=5bae9fb4-b5e1-45a0-af1e-a2f2553592c9]
22/11/04 18:31:30 INFO DAGScheduler: Asked to cancel job group 5bae9fb4-b5e1-45a0-af1e-a2f2553592c9
22/11/04 18:31:30 INFO TaskSchedulerImpl: Cancelling stage 366
22/11/04 18:31:30 INFO TaskSchedulerImpl: Killing all running tasks in stage 366: Stage cancelled
22/11/04 18:31:30 INFO MicroBatchExecution: QueryExecutionThread.interruptAndAwaitExecutionThreadTermination called with streaming query exit timeout=15000 ms&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;Any thoughts?&lt;/P&gt;</description>
    <pubDate>Fri, 04 Nov 2022 19:54:56 GMT</pubDate>
    <dc:creator>JesseLancaster</dc:creator>
    <dc:date>2022-11-04T19:54:56Z</dc:date>
    <item>
      <title>Hello, I'm trying to use Databricks on Azure with a Spark Structured Streaming job and am having a very mysterious issue. I boiled the job down to i...</title>
      <link>https://community.databricks.com/t5/data-engineering/hello-i-39-m-trying-to-use-databricks-on-azure-with-a-spark/m-p/23648#M16367</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I'm trying to use Databricks on Azure with a Spark Structured Streaming job and am having a very mysterious issue.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I boiled the job down to its basics for testing: reading from a Kafka topic and writing to the console in a foreachBatch.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Locally, everything works fine indefinitely.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;On Databricks, the task terminates after just over 5 minutes with a "Cancelled" status.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;There are no errors in the log, just the following, which appears to be a graceful shutdown request of some kind, but I don't know where it's coming from:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;22/11/04 18:31:30 INFO DriverCorral$: Cleaning the wrapper ReplId-1ea30-8e4c0-48422-a (currently in status Running(ReplId-1ea30-8e4c0-48422-a,ExecutionId(job-774316032912321-run-84401-action-5645198327600153),RunnableCommandId(9102993760433650959)))
22/11/04 18:31:30 INFO DAGScheduler: Asked to cancel job group 2207618020913201706_9102993760433650959_job-774316032912321-run-84401-action-5645198327600153
22/11/04 18:31:30 INFO ScalaDriverLocal: cancelled jobGroup:2207618020913201706_9102993760433650959_job-774316032912321-run-84401-action-5645198327600153 
22/11/04 18:31:30 INFO ScalaDriverWrapper: Stopping streams for commandId pattern: CommandIdPattern(2207618020913201706,None,Some(job-774316032912321-run-84401-action-5645198327600153)).
22/11/04 18:31:30 INFO DatabricksStreamingQueryListener: Stopping the stream [id=d41eff2a-4de6-4f17-8d1c-659d1c1b8d98, runId=5bae9fb4-b5e1-45a0-af1e-a2f2553592c9]
22/11/04 18:31:30 INFO DAGScheduler: Asked to cancel job group 5bae9fb4-b5e1-45a0-af1e-a2f2553592c9
22/11/04 18:31:30 INFO TaskSchedulerImpl: Cancelling stage 366
22/11/04 18:31:30 INFO TaskSchedulerImpl: Killing all running tasks in stage 366: Stage cancelled
22/11/04 18:31:30 INFO MicroBatchExecution: QueryExecutionThread.interruptAndAwaitExecutionThreadTermination called with streaming query exit timeout=15000 ms&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;Any thoughts?&lt;/P&gt;</description>
      <pubDate>Fri, 04 Nov 2022 19:54:56 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/hello-i-39-m-trying-to-use-databricks-on-azure-with-a-spark/m-p/23648#M16367</guid>
      <dc:creator>JesseLancaster</dc:creator>
      <dc:date>2022-11-04T19:54:56Z</dc:date>
    </item>
    <item>
      <title>Re: Hello, I'm trying to use Databricks on Azure with a Spark Structured Streaming job and am having a very mysterious issue. I boiled the job down to i...</title>
      <link>https://community.databricks.com/t5/data-engineering/hello-i-39-m-trying-to-use-databricks-on-azure-with-a-spark/m-p/23652#M16371</link>
      <description>&lt;P&gt;Kaniz,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Unfortunately, that information is not useful.&lt;/P&gt;&lt;P&gt;1) I'm familiar with Structured Streaming and checkpoints. I've developed with Spark for many years, just not on Databricks.&lt;/P&gt;&lt;P&gt;2) It doesn't address the reason for the failure. A streaming job should run without interruption and not have to be restarted every 5 minutes.&lt;/P&gt;&lt;P&gt;3) I tried setting up a retry policy, but it doesn't trigger (presumably because the status is a cancellation, not a failure), so even if I wanted to just restart the job every 5 minutes with a retry policy, I cannot.&lt;/P&gt;</description>
      <pubDate>Wed, 09 Nov 2022 17:16:49 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/hello-i-39-m-trying-to-use-databricks-on-azure-with-a-spark/m-p/23652#M16371</guid>
      <dc:creator>JesseLancaster</dc:creator>
      <dc:date>2022-11-09T17:16:49Z</dc:date>
    </item>
    <item>
      <title>Re: Hello, I'm trying to use Databricks on Azure with a Spark Structured Streaming job and am having a very mysterious issue. I boiled the job down to i...</title>
      <link>https://community.databricks.com/t5/data-engineering/hello-i-39-m-trying-to-use-databricks-on-azure-with-a-spark/m-p/23654#M16373</link>
      <description>&lt;P&gt;Scala, Spark with Event Hubs via the Kafka interface.&lt;/P&gt;</description>
      <pubDate>Wed, 09 Nov 2022 18:05:34 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/hello-i-39-m-trying-to-use-databricks-on-azure-with-a-spark/m-p/23654#M16373</guid>
      <dc:creator>JesseLancaster</dc:creator>
      <dc:date>2022-11-09T18:05:34Z</dc:date>
    </item>
  </channel>
</rss>

