pradeep_singh
Contributor III

How to explain it to the client:
The job is operating at the resource ceiling of a very small driver. Small, normal day-to-day differences (file layout, plan choice, GC timing) sometimes push it over the limit, which is why restarts occasionally "fix" it: a restart clears memory and changes runtime conditions. This assumes no other workload or query is running on the same cluster.

As @MoJaMa suggested, move to a job cluster. Upgrade to the latest DBR if possible. Periodically optimize your target table to compact files, assuming it's in Delta format.
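
For the compaction step, here is a minimal sketch of how a scheduled task could build and run the `OPTIMIZE` statement. It assumes the target is a Delta table. The table name `main.sales.orders`, the column names, and the helper names `build_optimize_sql` and `optimize_table` are made up for illustration, so swap in your own.

```python
from typing import Optional, Sequence


def build_optimize_sql(table: str,
                       where: Optional[str] = None,
                       zorder_by: Sequence[str] = ()) -> str:
    """Build a Delta OPTIMIZE statement for `table` (hypothetical helper)."""
    sql = f"OPTIMIZE {table}"
    if where:
        # On partitioned tables the predicate should reference partition columns.
        sql += f" WHERE {where}"
    if zorder_by:
        # Optional: co-locate data on columns that are frequently filtered on.
        sql += f" ZORDER BY ({', '.join(zorder_by)})"
    return sql


def optimize_table(spark, table: str,
                   where: Optional[str] = None,
                   zorder_by: Sequence[str] = ()):
    """Run the statement on an existing SparkSession (e.g. `spark` in a notebook)."""
    return spark.sql(build_optimize_sql(table, where, zorder_by))


# Example statements; the table and column names are placeholders.
print(build_optimize_sql("main.sales.orders"))
print(build_optimize_sql("main.sales.orders",
                         where="order_date >= '2024-01-01'",
                         zorder_by=["customer_id"]))
```

Running this as its own scheduled job, rather than inside the failing job, keeps the compaction work off the small driver that is already at its limit.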

Thank You
Pradeep Singh - https://www.linkedin.com/in/dbxdev