Databricks Community

phdykd · ‎07-12-2023

This is the error I am getting: "RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method". I am using a 13.0nc12s_v3 cluster.

I tried this:

    import torch.multiprocessing as mp
    mp.set_start_method('spawn', force=True)
    from pytorch_lightning.callbacks import EarlyStopping

but I am still getting the same error. Is there a solution?

Thanks

Kumaran · ‎07-21-2023

Hi @phdykd,
Thank you for posting your question in the Databricks community.

1. One approach is to pass start_method="fork" to the spawn call: mp.spawn(*prev_args, start_method="fork"). This works, but it may raise a warning that recommends the method in option 2 below.
2. Another solution, recommended by PyTorch (link), is to use torch.multiprocessing.start_processes: torch.multiprocessing.start_processes(*prev_args, start_method="fork").

Note that neither of the options above is compatible with CUDA (link, link), so running any .cuda-related command will fail.
The solution that resolves all of these issues is to use TorchDistributor(local_mode=True).

Please refer to the documentation for more details.
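
For illustration, here is a minimal sketch of what the TorchDistributor(local_mode=True) approach might look like on a single GPU node. The names distributor_settings, train_fn, and run_training, the toy linear model, and the hyperparameters are all hypothetical and not from the original post. It assumes pyspark 3.4 or later and torch are installed on the cluster. The imports sit inside the functions so that torch, and therefore CUDA, is first initialized in the process that TorchDistributor launches rather than in the notebook process.

```python
def distributor_settings(gpu_count: int) -> dict:
    """Hypothetical helper: keyword arguments for a single-node TorchDistributor."""
    return {
        "num_processes": max(gpu_count, 1),  # one process per GPU, at least one
        "local_mode": True,                  # train on the driver node only
        "use_gpu": gpu_count > 0,            # fall back to CPU when no GPU is present
    }


def train_fn(learning_rate: float, steps: int) -> float:
    """Hypothetical training function executed in the launched process."""
    # Import inside the function so CUDA is initialized in the child process.
    import torch

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model = torch.nn.Linear(4, 1).to(device)  # toy model, for illustration only
    optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)

    loss = torch.tensor(0.0)
    for _ in range(steps):
        x = torch.randn(8, 4, device=device)  # random stand-in data
        y = torch.randn(8, 1, device=device)
        optimizer.zero_grad()
        loss = torch.nn.functional.mse_loss(model(x), y)
        loss.backward()
        optimizer.step()
    return loss.item()


def run_training(gpu_count: int = 1) -> float:
    """Launch train_fn through TorchDistributor (requires a Spark session)."""
    from pyspark.ml.torch.distributor import TorchDistributor

    distributor = TorchDistributor(**distributor_settings(gpu_count))
    # Positional arguments after the function are forwarded to train_fn.
    return distributor.run(train_fn, 1e-3, 5)
```

In a notebook you would call run_training(gpu_count=1) and get back the value returned by train_fn. With more than one process, a real training function would also need to set up torch.distributed (for example with DistributedDataParallel), which this sketch leaves out.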

