<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: The driver is temporarily unavailable in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/the-driver-is-temporarily-unavailable/m-p/19096#M12753</link>
    <description>&lt;P&gt;This error message means the Spark driver is running out of resources. Increasing the driver memory or using a larger instance type is a quick workaround, but identifying the root cause is the key step.&lt;/P&gt;&lt;P&gt;On the application side, review whether there are driver-intensive operations such as collect() or toPandas().&lt;/P&gt;&lt;P&gt;Also check whether the instance type is adequate for the workload.&lt;/P&gt;</description>
    <pubDate>Fri, 25 Jun 2021 18:47:58 GMT</pubDate>
    <dc:creator>brickster_2018</dc:creator>
    <dc:date>2021-06-25T18:47:58Z</dc:date>
    <item>
      <title>The driver is temporarily unavailable</title>
      <link>https://community.databricks.com/t5/data-engineering/the-driver-is-temporarily-unavailable/m-p/19095#M12752</link>
      <description>&lt;P&gt;My job fails with &amp;quot;Driver is temporarily unavailable&amp;quot;. Apparently it is permanently unavailable, because the job does not pause but fails.&lt;/P&gt;</description>
      <pubDate>Fri, 25 Jun 2021 18:43:48 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/the-driver-is-temporarily-unavailable/m-p/19095#M12752</guid>
      <dc:creator>brickster_2018</dc:creator>
      <dc:date>2021-06-25T18:43:48Z</dc:date>
    </item>
    <item>
      <title>Re: The driver is temporarily unavailable</title>
      <link>https://community.databricks.com/t5/data-engineering/the-driver-is-temporarily-unavailable/m-p/19096#M12753</link>
      <description>&lt;P&gt;This error message means the Spark driver is running out of resources. Increasing the driver memory or using a larger instance type is a quick workaround, but identifying the root cause is the key step.&lt;/P&gt;&lt;P&gt;On the application side, review whether there are driver-intensive operations such as collect() or toPandas().&lt;/P&gt;&lt;P&gt;Also check whether the instance type is adequate for the workload.&lt;/P&gt;&lt;P&gt;&lt;EM&gt;[A minimal illustrative sketch follows this item.]&lt;/EM&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 25 Jun 2021 18:47:58 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/the-driver-is-temporarily-unavailable/m-p/19096#M12753</guid>
      <dc:creator>brickster_2018</dc:creator>
      <dc:date>2021-06-25T18:47:58Z</dc:date>
    </item>
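    <!--
      Editor's note: a minimal, hypothetical PySpark sketch illustrating the advice in the reply
      above. It is not part of the original thread; the table and column names (events, event_date,
      daily_counts) are assumptions made purely for illustration.

      from pyspark.sql import SparkSession

      spark = SparkSession.builder.getOrCreate()     # on Databricks, `spark` is already provided

      # Driver-heavy pattern: collect() and toPandas() pull the entire result onto the driver,
      # which can exhaust driver memory and surface as "The driver is temporarily unavailable".
      # rows = spark.table("events").collect()       # avoid on large results
      # pdf  = spark.table("events").toPandas()      # avoid on large results

      df = spark.table("events")                     # hypothetical source table

      # Keep heavy work on the executors and only bring small, bounded results to the driver.
      daily_counts = df.groupBy("event_date").count()
      preview = daily_counts.limit(1000).toPandas()  # bounded result, safe to bring to the driver

      # For large outputs, write directly from the executors instead of collecting.
      daily_counts.write.mode("overwrite").saveAsTable("daily_counts")
    -->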
    <item>
      <title>Re: The driver is temporarily unavailable</title>
      <link>https://community.databricks.com/t5/data-engineering/the-driver-is-temporarily-unavailable/m-p/39891#M27087</link>
      <description>&lt;P&gt;I am facing the same issue. I am writing in batches using a simple for loop, with no collect statements inside the loop. I am rewriting partitions with dynamic partition overwrite mode into a huge, wide Delta table of several TB; the incremental load is sometimes 1-2 TB.&lt;/P&gt;&lt;P&gt;&lt;EM&gt;[A sketch of this write pattern follows this item.]&lt;/EM&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 14 Aug 2023 20:10:17 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/the-driver-is-temporarily-unavailable/m-p/39891#M27087</guid>
      <dc:creator>Chalki</dc:creator>
      <dc:date>2023-08-14T20:10:17Z</dc:date>
    </item>
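    <!--
      Editor's note: a hedged sketch of the batch-write pattern described in the post above,
      assuming hypothetical names (batch_paths, event_date, wide_table). Dynamic partition
      overwrite rewrites only the partitions present in each incoming batch rather than the
      whole Delta table; support for it depends on the Delta Lake / Databricks Runtime version.

      from pyspark.sql import SparkSession

      spark = SparkSession.builder.getOrCreate()    # on Databricks, `spark` is already provided

      # Overwrite only the partitions touched by each batch, not the entire table.
      spark.conf.set("spark.sql.sources.partitionOverwriteMode", "dynamic")

      for batch_path in batch_paths:                # hypothetical list of incremental batch locations
          (spark.read.format("parquet").load(batch_path)
               .write
               .format("delta")
               .mode("overwrite")
               .partitionBy("event_date")
               .saveAsTable("wide_table"))
    -->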
  </channel>
</rss>

