Is there any advice on how to get a HuggingFacePredictor to run on multiple GPUs? I tested on a single node with 1 vs. 2 GPUs, and both runs took the same amount of time.
I’m using Facebook’s zero-shot model in the HuggingFacePredictor.
PyTorch detects both GPUs.
I built the Ray cluster with 2 GPUs.
I also set ‘num_gpus_per_worker’ to 2 in the HuggingFacePredictor when calling ‘predict’.
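
For anyone trying to reproduce this, here is a minimal sketch of the kind of check behind "PyTorch detects both GPUs". The helper names count_visible_gpus and describe_gpus are made up for illustration and are not part of Ray or Hugging Face. The sketch assumes torch may or may not be installed and reports 0 devices if it is missing.

```python
def count_visible_gpus():
    """Return how many CUDA devices PyTorch can see (0 if torch or CUDA is unavailable)."""
    try:
        import torch  # imported lazily so the check still runs without torch
    except (ImportError, OSError):
        return 0
    if not torch.cuda.is_available():
        return 0
    return torch.cuda.device_count()


def describe_gpus():
    """Build a one-line, human-readable summary of the visible GPUs."""
    n = count_visible_gpus()
    if n == 0:
        return "No CUDA GPUs visible to PyTorch"
    import torch
    # get_device_name returns the device's marketing name for each index
    names = [torch.cuda.get_device_name(i) for i in range(n)]
    return f"{n} GPU(s) visible: " + ", ".join(names)


if __name__ == "__main__":
    print(describe_gpus())
```

On the setup described above this should report 2 devices. Note that it only confirms the devices are visible to the process, not that the predictor actually spreads work across them.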