I’m using Ray’s TorchTrainer() to train my model on a multi-node cluster. I’m hitting some errors, so for debugging I want to set TORCH_DISTRIBUTED_DEBUG=INFO and also pass find_unused_parameters=True through to PyTorch DDP. Is this possible when using TorchTrainer()? If not, is there another way I can debug my code?
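Here is a sketch of what I’m trying to do. The env var is set before the process group is initialized, and I’m guessing that `prepare_model`’s `parallel_strategy_kwargs` is where the DDP kwargs would go (`build_model` is just a placeholder for my own model code):

```python
import os

# Set before torch.distributed initializes the process group.
# Valid values are OFF, INFO, DETAIL.
os.environ["TORCH_DISTRIBUTED_DEBUG"] = "INFO"

def train_func(config):
    # Imports deferred so this runs inside each Ray Train worker.
    import ray.train.torch

    model = build_model()  # placeholder for my actual model

    # My guess: forward DDP kwargs via parallel_strategy_kwargs?
    model = ray.train.torch.prepare_model(
        model,
        parallel_strategy_kwargs={"find_unused_parameters": True},
    )
    # ... training loop ...
```

Does TorchTrainer() support forwarding kwargs to DDP like this, or does the env var have to be set per worker some other way (e.g. through `runtime_env`)?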