Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

JesseLancaster
New Contributor III

Hello,

I'm trying to use Databricks on Azure with a Spark Structured Streaming job and am having a very mysterious issue.

I boiled the job down to its basics for testing: reading from a Kafka topic and writing to the console in a foreachBatch.
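
A job of this shape can be sketched roughly as follows. This is an illustrative sketch, not the exact code from the job: the object name, the `BOOTSTRAP_HOST:9093` address, and `TOPIC_NAME` are placeholders, the authentication options needed for Event Hubs are omitted, and the final `awaitTermination()` call is an assumption about how the driver is kept alive.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical object name, used only for illustration.
object KafkaToConsoleSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("kafka-to-console-sketch")
      .getOrCreate()

    // Placeholder connection settings; authentication options are omitted.
    val source: DataFrame = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "BOOTSTRAP_HOST:9093")
      .option("subscribe", "TOPIC_NAME")
      .load()

    // An explicitly typed function value avoids the overload ambiguity that
    // foreachBatch can hit with a bare lambda on Scala 2.12.
    val printBatch: (DataFrame, Long) => Unit = (batch, batchId) => {
      println(s"Batch $batchId")
      batch.selectExpr("CAST(value AS STRING) AS value").show(truncate = false)
    }

    val query = source.writeStream
      .foreachBatch(printBatch)
      .start()

    // Assumption: block the main thread until the query stops.
    query.awaitTermination()
  }
}
```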

Locally, everything works fine indefinitely.

On Databricks, the task terminates after just over 5 minutes with a "Cancelled" status.

There are no errors in the log, just the following, which appears to be a graceful shutdown request of some kind, but I don't know where it's coming from:

22/11/04 18:31:30 INFO DriverCorral$: Cleaning the wrapper ReplId-1ea30-8e4c0-48422-a (currently in status Running(ReplId-1ea30-8e4c0-48422-a,ExecutionId(job-774316032912321-run-84401-action-5645198327600153),RunnableCommandId(9102993760433650959)))
22/11/04 18:31:30 INFO DAGScheduler: Asked to cancel job group 2207618020913201706_9102993760433650959_job-774316032912321-run-84401-action-5645198327600153
22/11/04 18:31:30 INFO ScalaDriverLocal: cancelled jobGroup:2207618020913201706_9102993760433650959_job-774316032912321-run-84401-action-5645198327600153 
22/11/04 18:31:30 INFO ScalaDriverWrapper: Stopping streams for commandId pattern: CommandIdPattern(2207618020913201706,None,Some(job-774316032912321-run-84401-action-5645198327600153)).
22/11/04 18:31:30 INFO DatabricksStreamingQueryListener: Stopping the stream [id=d41eff2a-4de6-4f17-8d1c-659d1c1b8d98, runId=5bae9fb4-b5e1-45a0-af1e-a2f2553592c9]
22/11/04 18:31:30 INFO DAGScheduler: Asked to cancel job group 5bae9fb4-b5e1-45a0-af1e-a2f2553592c9
22/11/04 18:31:30 INFO TaskSchedulerImpl: Cancelling stage 366
22/11/04 18:31:30 INFO TaskSchedulerImpl: Killing all running tasks in stage 366: Stage cancelled
22/11/04 18:31:30 INFO MicroBatchExecution: QueryExecutionThread.interruptAndAwaitExecutionThreadTermination called with streaming query exit timeout=15000 ms

Any thoughts?

2 REPLIES

Kaniz,

Unfortunately, that information is not useful.

1) I'm familiar with Structured Streaming and checkpoints. I've developed with Spark for many years, just not on Databricks.

2) This doesn't address the reason for the failure. A streaming job should run without interruption and not have to be restarted every 5 minutes.

3) I tried setting up a retry policy, but it doesn't trigger (presumably because the status is a cancellation, not a failure), so even if I wanted to just restart the job every 5 minutes with a retry policy, I cannot.

Scala, Spark with EventHubs via Kafka interface
