IMPALA with VTrace on multi-GPU with Pytorch

sven1977 · June 29, 2021, 2:14pm

Hey @Astariul, strange. Yeah, it could have to do with your custom action distribution, which moves things back onto the GPU in the dist.logp call inside multi_log_probs_from_logits_and_actions. It’s probably better to have your change in, then.

As background: v-trace calculations, as per the original IMPALA paper, should be done on the CPU because they are all sequential. That’s why we move the tensors inside the IMPALA loss, seemingly out of nowhere, from “device” to “cpu” (no matter what “device” is), and then back to “device” after the v-trace computations.
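
To make that round trip concrete, here is a minimal sketch of the pattern. It is not RLlib's actual implementation: the function name vtrace_on_cpu and its parameters are made up for illustration, and it assumes a single time-major trajectory of 1-D tensors. The backward loop is the sequential part: each step depends on the result of the step after it.

```python
import torch


def vtrace_on_cpu(log_rhos, rewards, values, bootstrap_value,
                  gamma=0.99, clip_rho=1.0, clip_c=1.0):
    """Hypothetical helper: compute v-trace targets on the CPU and
    return them on whatever device the inputs came from.

    log_rhos, rewards, values: 1-D tensors of length T.
    bootstrap_value: 0-D tensor holding the value estimate after step T-1.
    """
    device = values.device  # remember where the caller's tensors live

    # Move everything to the CPU. The targets are treated as constants,
    # so the inputs are detached from the autograd graph first.
    log_rhos, rewards, values, bootstrap_value = (
        t.detach().to("cpu")
        for t in (log_rhos, rewards, values, bootstrap_value)
    )

    # Truncated importance weights.
    rhos = torch.exp(log_rhos)
    clipped_rhos = torch.clamp(rhos, max=clip_rho)
    cs = torch.clamp(rhos, max=clip_c)

    # Temporal-difference terms, weighted by the clipped rhos.
    values_tp1 = torch.cat([values[1:], bootstrap_value.reshape(1)])
    deltas = clipped_rhos * (rewards + gamma * values_tp1 - values)

    # Sequential backward recursion: step t needs the result of step t+1.
    acc = torch.zeros((), dtype=values.dtype)
    vs_minus_v = torch.zeros_like(values)
    for t in reversed(range(values.shape[0])):
        acc = deltas[t] + gamma * cs[t] * acc
        vs_minus_v[t] = acc

    # Move the result back to the original device for the rest of the loss.
    return (vs_minus_v + values).to(device)
```

Anything computed between the two moves has to stay on the CPU as well. If a custom action distribution puts its logp output back on the GPU in that window, you end up with the kind of cpu/cuda mismatch discussed in this thread.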
