Errors in Hyperparameter tuning of PPO with Bayesian Optimization (ray.tune)

I am working with ray.rllib and trying to tune the hyperparameters using “search_alg=BayesOptSearch()” on the CartPole environment with PPO, as shown below.

Case 1: I passed the “gamma” parameter using a tune.uniform() search space as shown below, and the run completed successfully once the stop criterion was met.

import ray
from ray import tune
from ray.tune.suggest.bayesopt import BayesOptSearch
from ray.tune.suggest import ConcurrencyLimiter
import numpy as np

ray.init(num_cpus=2)
target = 4000
alg = BayesOptSearch(metric="episode_reward_mean", mode="max")
analysis = tune.run(
    "PPO",
    stop={"timesteps_total": target},
    metric="episode_reward_mean",
    mode="max",
    config={
        "env": 'CartPole-v0',
        "use_gae": True,
        "num_workers": 1,  # run in a single process
        "num_gpus": 0,
        "lr": 1e-4,
        "gamma":  tune.uniform(0.95, 0.97),
        "entropy_coeff": 0.01,

    },
    search_alg=alg,
    num_samples=1,

 )
print("best hyperparameters: ", analysis.best_config)

Case 2: I passed both the “gamma” and “entropy_coeff” parameters using tune.uniform() search spaces as shown below. This raises the error given below.

import ray
from ray import tune
from ray.tune.suggest.bayesopt import BayesOptSearch
from ray.tune.suggest import ConcurrencyLimiter
import numpy as np

ray.init(num_cpus=2)
target = 4000
alg = BayesOptSearch(metric="episode_reward_mean", mode="max")
analysis = tune.run(
    "PPO",

    stop={"timesteps_total": target},
    metric="episode_reward_mean",
    mode="max",
    config={
        "env": 'CartPole-v0',
        "use_gae": True,
        "num_workers": 1,  # run in a single process
        "num_gpus": 0,
        "lr": 1e-4,
        "gamma":  tune.uniform(0.95, 0.97),
        "entropy_coeff": tune.uniform(0.01, 0.1),
    },
    search_alg=alg,
    num_samples=1,

 )
print("best hyperparameters: ", analysis.best_config)

In this case, when I use tune.uniform() spaces for both parameters passed to tune.run(), the following error is raised.

File "C:\Users\AppData\Roaming\Python\Python39\site-packages\tensorflow\python\util\traceback_utils.py", line 153, in error_handler
pid=20312) raise e.with_traceback(filtered_tb) from None
pid=20312) File "C:\Users\AppData\Roaming\Python\Python39\site-packages\tensorflow\python\framework\op_def_library.py", line 550, in _apply_op_helper
pid=20312) raise TypeError(
pid=20312) TypeError: Input 'y' of 'Mul' Op has type float32 that does not match type float64 of argument 'x'.
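
If the mismatch comes from BayesOptSearch suggesting numpy.float64 values while the rest of the TF graph is float32 (an assumption on my part, not confirmed), one possible workaround is to wrap PPO in a function trainable and cast the suggested values to plain Python floats before the trainer is built. A minimal sketch (train_ppo is a made-up name):

import ray
from ray import tune
from ray.tune.suggest.bayesopt import BayesOptSearch
from ray.rllib.agents.ppo import PPOTrainer

def train_ppo(config):
    # Cast the searcher's suggestions (assumed to be numpy.float64) to plain
    # Python floats so TensorFlow infers float32 when building the loss graph.
    for key in ("gamma", "entropy_coeff"):
        config[key] = float(config[key])
    trainer = PPOTrainer(config=config)
    while True:
        result = trainer.train()
        tune.report(
            episode_reward_mean=result["episode_reward_mean"],
            timesteps_total=result["timesteps_total"],
        )

ray.init(num_cpus=2)
alg = BayesOptSearch(metric="episode_reward_mean", mode="max")
analysis = tune.run(
    train_ppo,  # function trainable instead of the registered "PPO" string
    stop={"timesteps_total": 4000},
    metric="episode_reward_mean",
    mode="max",
    config={
        "env": "CartPole-v0",
        "num_workers": 1,
        "num_gpus": 0,
        "lr": 1e-4,
        "gamma": tune.uniform(0.95, 0.97),
        "entropy_coeff": tune.uniform(0.01, 0.1),
    },
    search_alg=alg,
    num_samples=1,
)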

Case 3: I passed the “num_sgd_iter” parameter using a tune.randint() search space.

import ray
from ray import tune
from ray.tune.suggest.bayesopt import BayesOptSearch
from ray.tune.suggest import ConcurrencyLimiter
import numpy as np

ray.init(num_cpus=2)
target = 4000
alg = BayesOptSearch(metric="episode_reward_mean", mode="max")
analysis = tune.run(
    "PPO",
    local_dir="./tb_logs/",
    stop={"timesteps_total": target},
    metric="episode_reward_mean",
    mode="max",
    config={
        "env": 'CartPole-v0',
        "use_gae": True,
        "num_workers": 1,  # run in a single process
        "num_gpus": 0,
        "lr": 1e-4,
        "gamma":  tune.uniform(0.95, 0.97),
        "entropy_coeff": 0.01,
        "num_sgd_iter": tune.randint(5,10),
    },
    search_alg=alg,
    num_samples=1,

 )
print("best hyperparameters: ", analysis.best_config)

This raises the following error:

ValueError: BayesOpt does not support parameters of type Integer

Can anyone help me understand what these errors mean and how to solve them?
Which type of search space should be used when passing parameter values to BayesOptSearch()?

I am working with Python 3.9 and the following libraries:
tensorflow==2.7.0
keras==2.7.0
tensorboard==2.7.0
gym==0.21.0
ray[rllib]==1.9.2
dm-tree==0.1.6
ray[tune]
ray[default]
bayesian-optimization

How severely does this issue affect your experience of using Ray?

  • High: It blocks me from completing my task.

It looks like BayesOpt only supports search over continuous spaces, but you’re searching over a discrete space for num_sgd_iter. You’ll either need to skip searching over that parameter when using BayesOpt, or use one of the other search algorithms: Tune Search Algorithms (tune.search) — Ray 2.8.0
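
For example, a searcher that does handle integer spaces could be swapped in. A minimal sketch using HyperOptSearch (assuming hyperopt is installed, e.g. pip install hyperopt, and using the ray.tune.suggest import path that matches ray 1.9.2):

import ray
from ray import tune
from ray.tune.suggest.hyperopt import HyperOptSearch

ray.init(num_cpus=2)
alg = HyperOptSearch(metric="episode_reward_mean", mode="max")
analysis = tune.run(
    "PPO",
    stop={"timesteps_total": 4000},
    metric="episode_reward_mean",
    mode="max",
    config={
        "env": "CartPole-v0",
        "num_workers": 1,
        "num_gpus": 0,
        "lr": 1e-4,
        "gamma": tune.uniform(0.95, 0.97),
        "num_sgd_iter": tune.randint(5, 10),  # integer spaces are supported by HyperOpt
    },
    search_alg=alg,
    num_samples=4,
)
print("best hyperparameters: ", analysis.best_config)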

Thank you for the reply, I understand. Setting num_sgd_iter aside, it is still not working for “gamma” and “entropy_coeff”, even though I gave them tune.uniform(), which creates a continuous space.