How severely does this issue affect your experience of using Ray?
- High: It blocks me from completing my task.
Hello all,
I would like to compare the performance of multiple algorithms, let's say ppo and dqn, for my custom environment.
I know I can manually execute my training pipeline multiple times for each algorithm in order to achieve a statistically significant comparison, and then compare their respective performances. However, I believe there might be a more efficient approach.
I wonder if anyone knows the best way to perform this comparison in RLlib?
I would like to run my training pipeline only once, where each algorithm is trained, for example, n times on the same env, and then compare the performance of the algorithms.
Thanks!
Hi, you can create a script that iterates over the algorithms. Or, if you want to run them in parallel, you can use Ray Core to launch those experiments in parallel (assuming you have enough resources to run them concurrently).
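For the Ray Core route, something along these lines should work. This is only a rough sketch on my side; the env name, iteration count, and result key are placeholders you would adapt to your own setup:

import ray
from ray.rllib.algorithms.dqn import DQNConfig
from ray.rllib.algorithms.ppo import PPOConfig

# Each training run happens in its own Ray task, so the algorithms train concurrently.
@ray.remote
def train(algo_config, num_iterations=10):
    algo = algo_config.build()
    result = None
    for _ in range(num_iterations):
        result = algo.train()
    algo.stop()
    # "episode_reward_mean" is a top-level result key in older Ray versions;
    # newer releases may nest it differently.
    return result["episode_reward_mean"]

# "CartPole-v1" stands in for your custom env; swap in your own configs here.
configs = {
    "ppo": PPOConfig().environment("CartPole-v1"),
    "dqn": DQNConfig().environment("CartPole-v1"),
}
rewards = ray.get([train.remote(cfg) for cfg in configs.values()])
print(dict(zip(configs, rewards)))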
Hi @kourosh,
Thanks for your quick reply.
"create a script that iterates over the algorithms": yes, this is what I'm doing now, but I thought there might be a better way to do that, like hyperparameter tuning in Ray.
I wonder if you know whether I can use wandb to perform this? At least, wandb makes multiple runs and then provides me with a dashboard where doing comparisons is easy. However, I'm not sure whether the WandbLoggerCallback supports this or not.
Thanks!
You can use Ray Tune out of the box to sweep hyperparameters within a single algorithm. If you want to sweep over the algorithm itself, you need to create a function / trainable that runs the algorithm, and optionally use Tune to sweep that trainable's parameters.
Regarding wandb: you can also create the WandbLoggerCallback with different groups that will separate, let's say, DQN from PPO, but still allow you to compare them against each other.
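For example, the trainable approach could look roughly like this. Treat it as a sketch: the env name, iteration count, result key, and the project/group names are assumptions on my side, and import paths can differ between Ray versions:

from ray import air, tune
from ray.air import session
from ray.air.integrations.wandb import WandbLoggerCallback
from ray.tune.registry import get_trainable_cls

# A function trainable whose "algo" key selects which RLlib algorithm to run.
def train_algo(config):
    algo_cls = get_trainable_cls(config["algo"])  # e.g. the PPO or DQN class
    algo = algo_cls.get_default_config().environment(config["env"]).build()
    for _ in range(10):
        result = algo.train()
        session.report({"episode_reward_mean": result["episode_reward_mean"]})
    algo.stop()

tuner = tune.Tuner(
    train_algo,
    param_space={
        "algo": tune.grid_search(["PPO", "DQN"]),  # sweep over the algorithm itself
        "env": "CartPole-v1",                      # placeholder for your custom env
    },
    run_config=air.RunConfig(
        callbacks=[
            # Each Tune trial becomes a separate wandb run; pick project/group so
            # PPO and DQN stay easy to compare on the dashboard.
            WandbLoggerCallback(project="algo-comparison", group="ppo-vs-dqn"),
        ],
    ),
)
tuner.fit()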
Many thanks @kourosh,
I wonder if you have any script showing how to set up the WandbLoggerCallback for such a scenario?
I’ve figured it out! It was easier than I expected!
This is how I implemented it:
You can have two for loops: one for the algos (n_runs=2 in my implementation) and one for the number of times you want to run each algo to get a good statistical comparison (n_trials=5 in my implementation). Like this:
for run_id in range(n_runs):
    for trial_id in range(n_trials):
Then, in the RunConfig you can set the callbacks with WandbLoggerCallback, while passing the project and group arguments. Like this:
run_config = air.RunConfig(
    callbacks=[
        WandbLoggerCallback(
            project=project_name,
            group=f"algo_{algo_name}__run_{run_id}__trial_{trial_id}",
        )
    ]
)
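To put the pieces together, the full loop could look roughly like this, assuming each algo/trial is launched with tune.Tuner and the registered algorithm name (the stop criterion, env, and project name below are placeholders, not my exact setup, and import paths may differ slightly across Ray versions):

from ray import air, tune
from ray.air.integrations.wandb import WandbLoggerCallback

project_name = "algo-comparison"   # placeholder project name
algos = ["PPO", "DQN"]             # n_runs = 2
n_trials = 5

for run_id, algo_name in enumerate(algos):
    for trial_id in range(n_trials):
        run_config = air.RunConfig(
            stop={"training_iteration": 50},   # placeholder stop criterion
            callbacks=[
                WandbLoggerCallback(
                    project=project_name,
                    group=f"algo_{algo_name}__run_{run_id}__trial_{trial_id}",
                )
            ],
        )
        # RLlib algorithms are registered Tune trainables, so the name string works here.
        tuner = tune.Tuner(
            algo_name,
            param_space={"env": "CartPole-v1"},   # placeholder for your custom env
            run_config=run_config,
        )
        tuner.fit()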
Next, on your wandb dashboard you will see the resulting figures, and of course, you can compare them via the grouping button in wandb's figure edit panel.
I hope this is useful for others as well!