Streaming job update

thibault — Fri, 06 Sep 2024 12:41:08 GMT

Hi!

Using bundles, I want to update a running streaming job. Everything works until the new version is deployed: the running job then has to be stopped and started again manually before it picks up the new assets. If nobody does that, the job keeps running the old version.

How do you typically handle updates to streaming jobs automatically?

Re: Streaming job update

mark_ott — Tue, 04 Nov 2025 12:51:24 GMT

To handle updates to streaming jobs automatically and ensure that new code or assets are picked up without requiring manual stops and restarts, you typically use one of the following approaches depending on your streaming framework and deployment environment:

Best Practice Approaches

Parallel Pipeline Deployment: Some managed platforms (like Google Dataflow) support "parallel pipeline updates," where a new version of the job is spun up in parallel with the old one, and the old job is drained after a set duration. This approach minimizes downtime and reduces manual steps, although it can temporarily duplicate data processing if not carefully managed. The new job must have a different name, and downstream consumers must handle duplicate or partial data that may result during the switchover.
Draining and Restart Automation: Where in-place updating or parallel replacement is not supported, automate the drain, stop, and start steps with CI/CD automation or an orchestrator (such as Airflow, Jenkins, or the scheduler APIs built into your cloud provider or streaming engine). These scripts or workflows stop the current job safely once the new version has been deployed and start the updated job immediately afterwards, minimizing human error and latency.
Stateful Streaming Upgrades: Frameworks such as Apache Flink, Kafka Streams, and Spark Structured Streaming generally require stopping the existing pipeline and starting a new one with the updated assets. Scripting this stop/start sequence keeps the downtime short. Some frameworks let the new job resume from saved state (for example, Flink savepoints taken before shutdown, or the checkpoint location in Spark Structured Streaming), which limits data loss and downtime.
In-flight Updates (where available): Some frameworks/platforms offer in-flight or rolling updates for streaming jobs, especially when only configuration or resource values are changed (not code or dependencies). For example, auto-scaling or light config updates may be safely applied on a running job, but code or asset changes usually require job restart.
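As a rough illustration of the drain/stop/start automation described above, here is a minimal Python sketch. The `StreamingJobClient` interface and all of its method names are hypothetical placeholders, not the API of any real platform; each method would need to be mapped to whatever your platform actually provides (a CLI call, a REST endpoint, or an SDK method).

```python
import time
from typing import Callable, Optional, Protocol


class StreamingJobClient(Protocol):
    """Hypothetical interface; map each method to your platform's real API."""

    def deploy(self, version: str) -> None: ...
    def stop_with_snapshot(self) -> Optional[str]: ...
    def start(self, restore_from: Optional[str]) -> None: ...
    def is_running(self) -> bool: ...


def redeploy(
    client: StreamingJobClient,
    version: str,
    timeout_s: float = 300.0,
    poll_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[str]:
    """Deploy new assets, stop the old run, and start the new one.

    Returns the snapshot identifier the new run was restored from
    (None if the platform has no savepoint/checkpoint concept).
    """
    # 1. Publish the new code/assets; the old run keeps going for now.
    client.deploy(version)

    # 2. Stop the old run, capturing state if the platform supports it.
    snapshot = client.stop_with_snapshot()

    # 3. Start a fresh run that picks up the new assets.
    client.start(restore_from=snapshot)

    # 4. Wait until the new run reports as running, or give up.
    deadline = clock() + timeout_s
    while not client.is_running():
        if clock() >= deadline:
            raise TimeoutError(f"job did not start within {timeout_s} seconds")
        sleep(poll_s)
    return snapshot
```

Running this as the last step of the CI/CD pipeline, right after the deployment step, removes the window in which the old version keeps running unnoticed. The `sleep` and `clock` parameters exist only so the function can be tested without real waiting.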

Tools and Automation Suggestions

Use CI/CD pipelines to automate deployment, draining, stopping, and starting of updated stream jobs.
Leverage job orchestration platforms with dependency/trigger management.
Where available, use cloud service APIs for jobs (such as Dataflow’s parallel updates or AWS Glue Streaming Job update APIs) to script the update process.
Always ensure consumers and downstream systems are designed to handle duplicates or short gaps during transition windows.
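One way a CI/CD step could decide whether a stop/start is needed at all is to compare the version the running job was started with against the version that was just deployed. The sketch below assumes the job exposes its version somehow (for example as a tag or parameter); `JobStatus`, `running_version`, and `restart_action` are hypothetical names used for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class JobStatus:
    """Hypothetical snapshot of a streaming job's state."""

    running: bool
    running_version: Optional[str]  # version tag of the active run, if any


def restart_action(status: JobStatus, deployed_version: str) -> str:
    """Return 'start', 'restart', or 'none' for the given job state."""
    if not status.running:
        # Nothing is running, so the new version only needs to be started.
        return "start"
    if status.running_version != deployed_version:
        # An older (or unknown) version is still running: stop and start.
        return "restart"
    # The running job already uses the deployed version.
    return "none"
```

The same check can also run on a schedule as a safety net, so that a job left on an old version is detected even if a deployment step was skipped.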

Additional Considerations

Be aware of data processing guarantees and possible duplicate/partial data during parallel runs or quick restarts, and plan your sinks/outputs accordingly (idempotent writes or deduplication logic).
Monitor lag, throughput, and state hydration to ensure the post-update service resumes smoothly.
For frameworks not supporting direct in-place updates, consider implementing blue/green deployment patterns for pipelines.
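To make the idempotent-write point above concrete, here is a small in-memory sketch of a sink that ignores records it has already seen. It assumes every record carries a stable unique identifier (`event_id` here, a hypothetical field); a real sink would typically achieve the same effect with a keyed upsert or merge in the target store.

```python
from typing import Any, Dict


class IdempotentSink:
    """Keeps one payload per event_id, so replays do not create duplicates."""

    def __init__(self) -> None:
        self._rows: Dict[str, Any] = {}

    def write(self, event_id: str, payload: Any) -> bool:
        """Store the record; return False if this event_id was already written."""
        if event_id in self._rows:
            return False
        self._rows[event_id] = payload
        return True

    def __len__(self) -> int:
        return len(self._rows)


# Old and new job versions overlap on event "e2" during the switchover.
sink = IdempotentSink()
for event_id, payload in [("e1", 10), ("e2", 20)]:  # written by the old job
    sink.write(event_id, payload)
for event_id, payload in [("e2", 20), ("e3", 30)]:  # replayed by the new job
    sink.write(event_id, payload)
```

After both loops the sink holds three records, not four, because the replayed "e2" is dropped.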

In summary, you should automate the deployment and (if needed) the stop/start or drain/restart phases as much as possible and use any available managed features for rolling or parallel updates, to avoid manual intervention and reduce risk of running outdated code.
