[RLLib] Distinguishing hyperparameter tuning from a single execution of an RL algorithm

Hi everyone,

I am trying to distinguish the commands for ‘hyperparameter tuning of an RLlib algorithm’ from the commands for ‘a one-time execution of the same algorithm with constant, pre-defined values for the needed hyperparameters’.
The page https://docs.ray.io/en/master/ray-overview/index.html#gentle-intro
shows a PPO example that uses tune.run() without any hyperparameter space, and the page https://docs.ray.io/en/master/tune/examples/pbt_ppo_example.html shows a PPO example that uses tune.run() with a hyperparameter space (and a scheduler).
From this observation, I understand that the same tune.run() method can be used for both hyperparameter tuning and one-time RL training: if we provide a hyperparameter search space and a scheduler, it is tuning; otherwise it is a single execution of the RL algorithm. Please let me know whether my understanding is correct.
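To make the distinction concrete, here is a rough sketch of the two uses side by side. This assumes the Ray 1.x-era `tune.run()` API shown in the linked docs; the exact config keys (`lr`, `num_workers`, stopping criteria) are illustrative and depend on your Ray/RLlib version, so treat this as a configuration sketch rather than a verified script.

```python
from ray import tune

# One-time training: every hyperparameter is a fixed scalar and no search
# space or scheduler is given, so tune.run launches exactly one trial.
tune.run(
    "PPO",
    stop={"training_iteration": 100},
    config={
        "env": "CartPole-v0",
        "lr": 1e-4,        # fixed scalar -> single trial
        "num_workers": 2,
    },
)

# Hyperparameter tuning: search-space primitives (tune.choice, tune.uniform,
# tune.grid_search, ...) in the config make Tune sample and run many trials.
tune.run(
    "PPO",
    stop={"training_iteration": 100},
    num_samples=4,         # how many configs to sample from the space
    config={
        "env": "CartPole-v0",
        "lr": tune.uniform(1e-5, 1e-3),   # sampled anew for each trial
        "num_workers": 2,
    },
)
```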

@Saurabh_Arora Your understanding is correct. I’m not familiar with PopulationBasedTraining, but let’s assume there is a way to turn off the hyperparameter space for it. In your second link, if you use scalar values instead of tune.choice, this would turn off the hyperparameter search and become just a one-time RL training.
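As a hypothetical illustration of that substitution (the keys below are loosely based on the PBT PPO example's config; the original values differ and are not reproduced here):

```python
# Where the tuning example samples values, e.g.
#   "lambda": tune.choice([0.9, 0.95, 1.0]),
# replace each search-space primitive with one fixed scalar:
config = {
    "env": "CartPole-v0",   # assumed env for illustration
    "lambda": 0.95,         # was a tune.choice(...) in the tuning version
    "clip_param": 0.2,      # fixed scalar
    "lr": 1e-4,             # fixed scalar
}

# With only scalars in `config`, no scheduler, and no num_samples > 1,
# tune.run("PPO", config=config) performs a single training run, not a search.
print(config["lambda"])
```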


Thanks, @RickLan. I have one more related question. I posted it here: https://discuss.ray.io/t/rlllib-how-to-use-policy-learned-in-tune-run/2222