I am working with ray.rllib and trying to run PPO on the “CartPole-v0” env for different parameter combinations using tune.grid_search(), as shown in the code below. However, this creates a separate trial for each parameter combination up front and then runs all of them in parallel. Is there a way to run the trials one after the other? In detail: the first trial should run to completion, until its stop criteria are fulfilled, and only then should the next trial start. Can anyone help me out with this?
OK, thank you for the answer.
I have tried both ways:
Case 1: ray.init(num_cpus=1) and num_workers: 0
Case 2: ray.init(num_cpus=1) and num_workers: 1
“Case 1” worked perfectly and ran the way I wanted, but in “Case 2” all the trials stayed in PENDING status for a long time and never changed to RUNNING.
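The issue with the second case is that you provided fewer CPUs than a single trial requires. Usually a trial needs num_workers + 1 CPUs: one per rollout worker plus one for the trainer process itself. With num_cpus=1 and num_workers: 1, each trial asks for 2 CPUs, so none of them can ever be scheduled and they all stay PENDING. What you need to balance is providing enough CPUs for the workers of one trial, but not so many that more than one trial fits at the same time.
Keep in mind that if you set num_workers to <= 1, you will not have any parallelism when sampling new experiences.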
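To make the arithmetic concrete, here is a small sketch in plain Python. It does not call Ray at all; the helper names `cpus_per_trial` and `max_concurrent_trials` are made up for illustration, and the num_workers + 1 rule is the rule of thumb from above, assuming the default of one CPU per worker and one for the trainer.

```python
def cpus_per_trial(num_workers: int) -> int:
    """CPUs one trial requests under the num_workers + 1 rule of thumb.

    Assumption: one CPU per rollout worker plus one for the trainer process.
    """
    return num_workers + 1


def max_concurrent_trials(total_cpus: int, num_workers: int) -> int:
    """How many trials fit at once given the CPUs passed to ray.init(num_cpus=...)."""
    return total_cpus // cpus_per_trial(num_workers)


# (total_cpus, num_workers) pairs; the first two are Case 1 and Case 2 above.
for total_cpus, num_workers in [(1, 0), (1, 1), (2, 1), (4, 1)]:
    n = max_concurrent_trials(total_cpus, num_workers)
    if n == 0:
        outcome = "no trial fits, everything stays PENDING"
    elif n == 1:
        outcome = "trials run one after the other"
    else:
        outcome = f"{n} trials run in parallel"
    print(f"num_cpus={total_cpus}, num_workers={num_workers}: {outcome}")
```

So to keep num_workers: 1 and still run the trials sequentially, num_cpus would need to be at least 2 but less than 4 under these assumptions.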