When running streaming pipelines, the key is to design for stability and isolation rather than relying on jobs that periodically restart your clusters.
The first thing to do is run your streams on Jobs Compute, not All-Purpose clusters. If available, use Serverless Jobs. Each pipeline should have its own dedicated job cluster, which ensures clean, isolated runtimes, consistent libraries, and automatic retries, all of which reduce drift and instability.
Choose a recent LTS Databricks Runtime with Photon (e.g., 14.x or 15.x). Photon gives a real boost in JSON parsing and Delta writes. Enable autoscaling with a minimum of at least two workers to prevent executors from churning, and avoid min=0 for long-running streams.
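As a rough illustration of the cluster settings above, here is a minimal sketch that builds a Jobs API 2.1 style payload in Python. The job name, notebook path, node type, worker ceiling, and the exact runtime string are assumptions for the example, not values from your setup, so substitute your own.

```python
import json


def streaming_job_spec(name, notebook_path, min_workers=2, max_workers=8):
    """Build a Jobs API 2.1 style payload for one streaming pipeline.

    All concrete values (runtime string, node type, worker counts) are
    illustrative assumptions; adjust them for your workspace and cloud.
    """
    if min_workers < 2:
        raise ValueError("keep at least two workers for a long-running stream")
    if max_workers < min_workers:
        raise ValueError("max_workers must be >= min_workers")
    return {
        "name": name,
        # One dedicated job cluster per pipeline, not a shared All-Purpose cluster.
        "job_clusters": [
            {
                "job_cluster_key": "stream_cluster",
                "new_cluster": {
                    "spark_version": "15.4.x-scala2.12",  # assumed LTS runtime
                    "runtime_engine": "PHOTON",
                    "node_type_id": "i3.xlarge",  # assumed; cloud-specific
                    "autoscale": {
                        "min_workers": min_workers,
                        "max_workers": max_workers,
                    },
                },
            }
        ],
        "tasks": [
            {
                "task_key": "run_stream",
                "job_cluster_key": "stream_cluster",
                "notebook_task": {"notebook_path": notebook_path},
                "max_retries": -1,  # -1 means retry indefinitely
                "min_retry_interval_millis": 60000,
            }
        ],
    }


# Hypothetical job name and notebook path, for illustration only.
spec = streaming_job_spec("orders_stream", "/Pipelines/orders_stream")
print(json.dumps(spec, indent=2))
```

You could submit a payload like this through the Jobs REST API, or express the same settings in the Workflows UI; the point is the autoscaling floor of two workers and the task-level retries.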
You didn't mention if you're using Delta Live Tables (now called Declarative Pipelines), but that's worth exploring. DLT automatically manages cluster lifecycles, recovery, data quality checks, autoscaling, and lineage, all built in.
In short:
Run your workloads through Workflows (Jobs) using Job or Serverless clusters with retries, autoscaling floors, proper checkpoints, file-notification mode, and monitoring.
There's no need for a separate optimizer job to stop and restart clusters. Follow good hygiene for checkpoints, file notifications, small files, and state management instead. You'll find detailed guidance in the Databricks documentation on streaming best practices and Auto Loader performance tuning.
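To make the checkpoint and file-notification points concrete, here is a minimal PySpark sketch of an Auto Loader stream. The function names, paths, table name, JSON source format, and trigger interval are all hypothetical, and file-notification mode also assumes your cloud permissions allow Databricks to set up the notification resources.

```python
def auto_loader_options(schema_location, file_format="json"):
    """Auto Loader options with file-notification mode turned on."""
    return {
        "cloudFiles.format": file_format,
        # Use cloud notifications instead of repeated directory listing.
        "cloudFiles.useNotifications": "true",
        "cloudFiles.schemaLocation": schema_location,
    }


def start_stream(spark, source_path, target_table, checkpoint_root):
    """Start one stream with its own stable checkpoint location.

    Assumes a Databricks runtime where the `cloudFiles` source is available;
    `spark` is the active SparkSession.
    """
    reader = spark.readStream.format("cloudFiles")
    for key, value in auto_loader_options(f"{checkpoint_root}/schema").items():
        reader = reader.option(key, value)
    return (
        reader.load(source_path)
        .writeStream
        # Keep this path fixed across restarts so the stream resumes where it left off.
        .option("checkpointLocation", f"{checkpoint_root}/checkpoint")
        .trigger(processingTime="1 minute")  # assumed interval
        .toTable(target_table)
    )
```

In a notebook or job task you would call something like `start_stream(spark, "<landing path>", "<catalog.schema.table>", "<checkpoint root>")`, giving each stream its own checkpoint root. With retries on the job and a stable checkpoint, a failed run resumes on its own without a separate restart job.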