Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

DatabricksStreamingQueryListener Stopping the stream

DE-cat
New Contributor III

I am running the following Structured Streaming Scala code in a Databricks Runtime 13.3 LTS job:

 

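		import java.time.{Duration, Instant}

		// Stream the Delta table and hand each micro-batch to process.
		val query = spark.readStream.format("delta")
			.option("ignoreDeletes", "true")
			.option("maxFilesPerTrigger", maxEqlPerBatch)
			.load(tblPath)
			.writeStream
			.queryName(getQueryName)
			.outputMode("append")
			.option("checkpointLocation", runtimePath + "/_checkpoint/stream.json")
			.foreachBatch(process _)
			.start()

		// Time of the most recent progress report that contained input rows.
		var lastBatchTime = Instant.now
		while (query.isActive) {
			// lastProgress is null until the first progress report is available.
			val progress = query.lastProgress
			if (progress != null) {
				if (progress.numInputRows > 0) {
					lastBatchTime = Instant.parse(progress.timestamp)
				} else {
					// No input rows: stop once the idle period exceeds the timeout.
					if (Duration.between(lastBatchTime, Instant.parse(progress.timestamp)).getSeconds > timeoutSeconds) {
						LOGGER.info(s"Stopping Query after inactivity timeout: $timeoutSeconds")
						query.stop()
					}
				}
			}
			// Wait up to 10 seconds for the query to terminate before polling again.
			query.awaitTermination(10000L)
		}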

 

This code should keep the streaming job alive for timeoutSeconds after the last non-empty batch is processed, but the job is canceled about 30 minutes after it starts, with these log messages:
DAGScheduler: Asked to cancel job group
ScalaDriverWrapper: Stopping streams for commandId pattern
DatabricksStreamingQueryListener: Stopping the stream
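
For reference, here is a minimal sketch of the inactivity rule I am aiming for, pulled out of the polling loop so it can be checked without a cluster. `IdleTracker` is a hypothetical helper (not part of Spark or Databricks), and the timestamps and row counts are made-up sample values. It only covers the timeout logic and says nothing about why the job group is canceled.

```scala
import java.time.{Duration, Instant}

// Hypothetical helper: remembers when the last non-empty batch was seen and
// reports whether the idle period has exceeded the timeout.
final case class IdleTracker(lastBatchTime: Instant, timeoutSeconds: Long) {

  // Takes one progress report (ISO-8601 timestamp string and input row count).
  // Returns the updated tracker and a flag saying whether the query should stop.
  def observe(timestamp: String, numInputRows: Long): (IdleTracker, Boolean) = {
    val now = Instant.parse(timestamp)
    if (numInputRows > 0) (copy(lastBatchTime = now), false)
    else (this, Duration.between(lastBatchTime, now).getSeconds > timeoutSeconds)
  }
}

// Sample run with a 60-second timeout (all values are illustrative).
val start = IdleTracker(Instant.parse("2024-01-01T00:00:00.000Z"), timeoutSeconds = 60L)
val (afterData, stopAfterData) = start.observe("2024-01-01T00:00:30.000Z", 100L) // rows arrived
val (afterIdle, stopAfterIdle) = afterData.observe("2024-01-01T00:01:00.000Z", 0L) // idle for 30 s
val (_, stopAfterTimeout) = afterIdle.observe("2024-01-01T00:01:31.000Z", 0L) // idle for 61 s
```

With these sample values only the last call asks for a stop, since the comparison is strictly greater than `timeoutSeconds`.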

Any help would be appreciated. Thanks.

0 REPLIES
