Does RLlib support multi-GPU plus multi-CPU training?

If I have two GPU machines and one CPU-only machine, how can I use the GPUs of the two GPU nodes and the CPUs of all three nodes?
With PPO, only one GPU is used; with DDPPO, the CPU-only node is not used.

We support single-node multi-GPU training.
However, RLlib does not currently support multi-node multi-GPU training. We are working on this, though.
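
For the single-node multi-GPU case, here is a minimal sketch, assuming the Ray 2.x AlgorithmConfig API, a PPO trainer, and the CartPole-v1 example environment (exact parameter names vary between RLlib versions):

from ray.rllib.algorithms.ppo import PPOConfig

# Sketch: learn on 2 GPUs of a single node and sample with CPU-only rollout workers.
config = (
    PPOConfig()
    .environment("CartPole-v1")
    .resources(num_gpus=2)            # GPUs used by the learner/driver process
    .rollouts(num_rollout_workers=8)  # CPU-only workers used for sampling
)
algo = config.build()
algo.train()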

You can generally make use of heterogeneous clusters with RLlib. For example, you can use one node with a GPU for training and another node with a GPU for sampling.
The different algorithms impose different restrictions here.
For example, as you said, DDPPO will only use GPU nodes, while other algorithms learn on the head node and use all other nodes only for sampling.

To get around this, you will have to put in some effort and heavily modify the algorithm you are looking at.
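
For reference, a rough sketch of the standard (unmodified) setup, where learning happens on the head node's GPU and sampling is spread across CPU-only rollout workers that Ray can schedule on any node. This assumes the Ray 2.x AlgorithmConfig API and the CartPole-v1 example environment; it is not a workaround for the multi-node multi-GPU limitation:

from ray.rllib.algorithms.ppo import PPOConfig

# Sketch: 1 learner GPU on the head node, many CPU-only sampling workers
# that Ray may place on the CPU-only nodes of the cluster.
config = (
    PPOConfig()
    .environment("CartPole-v1")
    .resources(num_gpus=1, num_cpus_per_worker=1)
    .rollouts(num_rollout_workers=20)
)
algo = config.build()
for _ in range(5):
    results = algo.train()
    print(results["episode_reward_mean"])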

Hi,
I am currently working on using SAC for my work and have a GPU node with 4 GPUs. But when I run my code (below), it does not seem to scale.

Could you please tell me why?

import torch
import ray
from ray.rllib.algorithms.sac import SACConfig

ray.init(num_gpus=4)

# Basic SAC training hyperparameters.
config = SACConfig().training(gamma=0.9, lr=0.01, train_batch_size=64)

# One learner worker per available GPU, each assigned its own GPU.
config = config.resources(
    num_gpus_per_learner_worker=1,
    num_learner_workers=torch.cuda.device_count())

# config = config.resources(num_gpus=torch.cuda.device_count())
config = config.rollouts(num_rollout_workers=100)

# Build an Algorithm object from the config and run 1 training iteration.
algo = config.build(env=MineEnv200x150)
algo.train()  # Algorithm.train() takes no arguments.
# model = ray.train.torch.prepare_model(model)

Here, MineEnv200x150 is my custom single-agent environment. The program does execute, but it does not scale over multiple GPUs.