Configuring Ray RLlib to only use the driver fails for PPO

LukasNothhelfer · October 31, 2021, 12:54am

I think I just stumbled upon a small bug in the validate_config function in ray/rllib/agents/ppo/ppo.py. It concerns this code snippet:

num_workers = config["num_workers"] or 1
calculated_min_rollout_size = \
    num_workers * config["num_envs_per_worker"] * \
    config["rollout_fragment_length"]
if config["train_batch_size"] > 0 and \
        config["train_batch_size"] % calculated_min_rollout_size != 0:

I want to configure Ray/Tune to use only the driver for training, rollouts and evaluation. Accordingly, I have made the following configuration:

("num_workers", 0),  
("num_envs_per_worker", 0),  
("num_cpus_per_worker", 0), 
("num_gpus_per_worker", 0),
("custom_resources_per_worker", {}),
("evaluation_num_workers", 0),  
("num_cpus_for_driver", 1),  
("create_env_on_driver", True)

But this raises an error because calculated_min_rollout_size evaluates to zero:

File "/home/lukas/anaconda3/lib/python3.8/site-packages/ray/rllib/agents/ppo/ppo.py", line 138, in validate_config
    config["train_batch_size"] % calculated_min_rollout_size != 0:
ZeroDivisionError: integer division or modulo by zero
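
The failure can be reproduced without RLlib. The following is a minimal sketch that copies the arithmetic from the quoted snippet into a hypothetical helper, min_rollout_size. The rollout_fragment_length and train_batch_size values are assumed for illustration and are not taken from the post.

```python
def min_rollout_size(config):
    """Hypothetical helper mirroring the arithmetic in the quoted snippet."""
    # `or 1` only rescues num_workers == 0; num_envs_per_worker is used as-is.
    num_workers = config["num_workers"] or 1
    return (num_workers
            * config["num_envs_per_worker"]
            * config["rollout_fragment_length"])


# rollout_fragment_length and train_batch_size are assumed example values.
failing = {
    "num_workers": 0,
    "num_envs_per_worker": 0,
    "rollout_fragment_length": 200,
    "train_batch_size": 4000,
}

size = min_rollout_size(failing)
print(size)  # 0, because num_envs_per_worker is 0

try:
    failing["train_batch_size"] % size
except ZeroDivisionError as exc:
    print("ZeroDivisionError:", exc)
```

In this sketch the product is zero only because num_envs_per_worker is zero; num_workers alone cannot cause it, since 0 is replaced by 1 before the multiplication.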

mannyv · October 31, 2021, 7:31pm

Hi @LukasNothhelfer,

Setting num_workers to zero is sufficient to use only the driver. You do not need to zero out the rest of the num_*_worker keys. In fact, I think you might mess up the driver config too, because when num_workers is 0 RLlib treats the driver as a worker.

LukasNothhelfer · October 31, 2021, 7:47pm

@mannyv I had to set num_envs_per_worker to 1.
This configuration works now:

("num_workers", 0),  
("num_envs_per_worker", 1),  
("num_cpus_per_worker", 0),  
("num_gpus_per_worker", 0),
("custom_resources_per_worker", {}),
("evaluation_num_workers", 0), 
("num_cpus_for_driver", 1),  
("create_env_on_driver", True),

The semantics of the validation function and the configuration confused me a bit.
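
One way to catch this before training starts is a small pre-flight check on the configuration. The sketch below is only an illustration: zero_rollout_keys is a hypothetical helper, not part of RLlib, and it assumes that only the two keys multiplied in the quoted validate_config snippet matter (num_workers is exempt because of the `or 1`).

```python
def zero_rollout_keys(config):
    """Hypothetical check: list keys that would zero the minimum rollout size."""
    # num_workers is not checked: the quoted snippet replaces 0 with 1 via `or 1`.
    checked = ("num_envs_per_worker", "rollout_fragment_length")
    # Keys missing from the config are assumed to keep a non-zero default.
    return [key for key in checked if config.get(key, 1) == 0]


# The working driver-only configuration from the post, as a dict.
working = dict([
    ("num_workers", 0),
    ("num_envs_per_worker", 1),
    ("num_cpus_per_worker", 0),
    ("num_gpus_per_worker", 0),
    ("custom_resources_per_worker", {}),
    ("evaluation_num_workers", 0),
    ("num_cpus_for_driver", 1),
    ("create_env_on_driver", True),
])

# The original failing variant differs only in num_envs_per_worker.
failing = dict(working, num_envs_per_worker=0)

print(zero_rollout_keys(working))  # []
print(zero_rollout_keys(failing))  # ['num_envs_per_worker']
```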

mannyv · October 31, 2021, 10:15pm

@LukasNothhelfer,
Me too; me too.
