Thanks, team!
I notice that sometimes only one experiment is running, and sometimes two experiments run at the same time. May I know which arguments decide this?
Did you use Tune? What are your resource allocation settings?
Hi Roller44, thanks for your reply!! My script looks like this:
config["num_gpus"] = 4
config["num_workers"] = 30
config["num_envs_per_worker"] =4
config["rollout_fragment_length"] =100
...skip a few other configs...
ray.init(num_cpus=40)
analysis = tune.run(
ppo.PPOTrainer,
config=config,
local_dir=log_dir_play_ray,
stop=stop,
checkpoint_at_end=True,
name=exp_name,
)
ray.shutdown()
Your configuration requires 30 * 4 = 120 CPUs to run all experiments in parallel (i.e., each of the 30 workers needs 4 CPUs to run its 4 environments, since each environment normally needs 1 CPU), but you only have 40 CPUs. So you should decrease num_workers or num_envs_per_worker.
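For example, a hypothetical resizing (keeping ray.init(num_cpus=40) and the 1-CPU-per-environment accounting above, not the only valid choice) that lets two trials share your 40 CPUs would be to shrink each trial's rollout footprint:

# Hypothetical adjustment: each trial then reserves about 4 x 4 + 1 = 17 CPUs
# (4 workers x 4 envs plus 1 CPU for the driver), so two trials fit into 40 CPUs.
config["num_workers"] = 4
config["num_envs_per_worker"] = 4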
The bottom line is, the number of experiments running in parallel is equal to:
min(
    total CPUs / (num_cpus_per_worker x num_workers x num_envs_per_worker + num_cpus_for_driver),
    total GPUs / (num_gpus_per_worker x num_workers + num_gpus)
)

where "total CPUs" and "total GPUs" are the resources available to Ray (e.g., the num_cpus you pass to ray.init()).
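As a rough sketch of that rule of thumb (a hypothetical helper written for this thread, not part of Ray or Tune; check the per-trial terms against your own config defaults):

# Hypothetical helper: estimates how many Tune trials fit, using the accounting above.
def max_parallel_trials(total_cpus, total_gpus,
                        num_cpus_per_worker, num_workers, num_envs_per_worker,
                        num_cpus_for_driver, num_gpus_per_worker, num_gpus):
    # Resources a single trial reserves.
    cpus_per_trial = num_cpus_per_worker * num_workers * num_envs_per_worker + num_cpus_for_driver
    gpus_per_trial = num_gpus_per_worker * num_workers + num_gpus
    cpu_bound = total_cpus // cpus_per_trial
    gpu_bound = total_gpus // gpus_per_trial if gpus_per_trial else float("inf")
    return min(cpu_bound, gpu_bound)

# Example with made-up numbers: 4 workers x 4 envs + 1 driver CPU = 17 CPUs and
# 2 GPUs per trial, so two trials can share 40 CPUs / 4 GPUs:
# max_parallel_trials(40, 4, 1, 4, 4, 1, 0, 2)  ->  min(40 // 17, 4 // 2) = 2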
Great! Thanks! I will have a try.