I'm facing this exact issue, only with a standard job instead of a DLT pipeline. I can't use serverless or restart the cluster periodically, for reasons outside my control. Any specific advice on diagnosing and resolving it? I don't think it can be check...