Brayden
Q:

torch.distributed: "Address already in use"

# Kill zombie processes left over by torch.distributed.launch
# For a small number of GPUs, find each PID and kill it individually
$ ps -a
$ kill -9 [pid]
# For larger numbers of GPUs, kill every process running the training script
$ kill -9 $(ps aux | grep YOUR_TRAINING_SCRIPT.py | grep -v grep | awk '{print $2}')
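If you would rather avoid the collision than clean up after it, you can point the next run at a port the OS reports as free. The sketch below uses `MASTER_PORT`, the environment variable torch.distributed launchers read for the rendezvous port; the `python3` one-liner is a generic free-port trick, not something from the original post.

```shell
# Ask the OS for an unused TCP port (bind to port 0), then export it as
# MASTER_PORT so the next torch.distributed run rendezvouses there instead
# of on a port still held by a zombie process.
MASTER_PORT=$(python3 -c 'import socket; s=socket.socket(); s.bind(("", 0)); print(s.getsockname()[1]); s.close()')
export MASTER_PORT
echo "$MASTER_PORT"
```

Note there is a small race window between releasing the probed port and the launcher binding it, so this is a convenience, not a guarantee.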