What you’re seeing (a monotonic, stair‑step climb in driver RAM over thousands of DEEP CLONE statements) is a very common pattern: the driver is not “holding data”, it is holding metadata, query artifacts, and per‑command state that accumulates faster than the JVM can reclaim it.
Even a “pure SQL DDL/DML” workload can bloat the driver because the driver is the control plane for:
parsing/analysis and query planning
tracking Spark SQL executions
holding session/catalog metadata
tracking job/stage/task events + SQL UI state
caching file indexes / transaction log snapshots
retaining objects due to references (listeners, accumulators, progress reporters)
Reduce concurrency
Deep clones are heavy metadata + file movement operations. Running too many in parallel can overload the driver and run into S3/cloud-storage API listing limits.
Try:
cut ForEach concurrency by 50–80%
aim for a concurrency level where driver memory flattens instead of rising
You’ll often get better overall throughput because you avoid driver GC thrash and retries.
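A minimal sketch of capping clone parallelism with a bounded worker pool. Everything here is illustrative: `run_clone` is a placeholder for the real call (in a Databricks job it would be something like `spark.sql(f"CREATE OR REPLACE TABLE {target} DEEP CLONE {source}")`), and the table pairs are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def run_clone(source: str, target: str) -> str:
    # Placeholder for the real spark.sql("... DEEP CLONE ...") call.
    return f"cloned {source} -> {target}"

def clone_all(clone_pairs, max_workers=4):
    # max_workers caps how many clones run at once; lower it until
    # driver memory flattens instead of climbing step by step.
    results = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(run_clone, s, t) for s, t in clone_pairs]
        for f in as_completed(futures):
            results.append(f.result())
    return results
```

Tune `max_workers` downward first; a smaller, steady pool usually beats a large one that spends its time in GC pauses and retries.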
Prefer SHALLOW CLONE + async copy when it meets your requirements
If your use case is “replicate table structure quickly” and you don’t immediately need independent data copies, shallow clone is dramatically cheaper: it copies only metadata and references the source files rather than copying all the data.
Then later:
convert to a deep clone only for the subset that truly needs it, or
use other replication patterns.
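A sketch of building the shallow-clone statement (standard Delta Lake SQL); the table names are hypothetical and the resulting string would be run via `spark.sql(...)`:

```python
def shallow_clone_sql(target: str, source: str) -> str:
    # SHALLOW CLONE copies only the Delta transaction log / metadata
    # and references the source data files instead of copying them.
    return f"CREATE OR REPLACE TABLE {target} SHALLOW CLONE {source}"
```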
If doing deep clones for migration, consider alternative approaches
Depending on why you’re cloning (migration, environment refresh, etc.), alternatives can reduce driver overhead:
CTAS (CREATE TABLE AS SELECT): heavier on cluster compute, but sometimes more stable on the driver
incremental copy strategies (if source is changing)
external replication tools/workflows