Errors in Hyperparameter tuning of PPO with Bayesian Optimization (ray.tune)

I am working with ray.rllib and trying to tune the hyperparameters using “search_alg=BayesOptSearch()” on the CartPole environment with PPO, as shown below.

Case 1: I passed the “gamma” parameter using a tune.uniform() search space as shown below, and the run completed successfully once the stop criterion was met.

import ray
from ray import tune
from ray.tune.suggest.bayesopt import BayesOptSearch
from ray.tune.suggest import ConcurrencyLimiter
import numpy as np

ray.init(num_cpus=2)
target = 4000
alg = BayesOptSearch(metric="episode_reward_mean", mode="max")
analysis = tune.run(
    "PPO",
    stop={"timesteps_total": target},
    metric="episode_reward_mean",
    mode="max",
    config={
        "env": 'CartPole-v0',
        "use_gae": True,
        "num_workers": 1,  # run in a single process
        "num_gpus": 0,
        "lr": 1e-4,
        "gamma":  tune.uniform(0.95, 0.97),
        "entropy_coeff": 0.01,

    },
    search_alg=alg,
    num_samples=1,

 )
print("best hyperparameters: ", analysis.best_config)

Case 2: I passed both the “gamma” and “entropy_coeff” parameters using tune.uniform() search spaces as shown below. This raises the error given below.

import ray
from ray import tune
from ray.tune.suggest.bayesopt import BayesOptSearch
from ray.tune.suggest import ConcurrencyLimiter
import numpy as np

ray.init(num_cpus=2)
target = 4000
alg = BayesOptSearch(metric="episode_reward_mean", mode="max")
analysis = tune.run(
    "PPO",

    stop={"timesteps_total": target},
    metric="episode_reward_mean",
    mode="max",
    config={
        "env": 'CartPole-v0',
        "use_gae": True,
        "num_workers": 1,  # run in a single process
        "num_gpus": 0,
        "lr": 1e-4,
        "gamma":  tune.uniform(0.95, 0.97),
        "entropy_coeff": tune.uniform(0.01, 0.1),
    },
    search_alg=alg,
    num_samples=1,

 )
print("best hyperparameters: ", analysis.best_config)

In this case, when I use tune.uniform() spaces for both parameters passed to tune.run(), the following error is raised.

File "C:\Users\AppData\Roaming\Python\Python39\site-packages\tensorflow\python\util\traceback_utils.py", line 153, in error_handler
pid=20312) raise e.with_traceback(filtered_tb) from None
pid=20312) File "C:\Users\AppData\Roaming\Python\Python39\site-packages\tensorflow\python\framework\op_def_library.py", line 550, in _apply_op_helper
pid=20312) raise TypeError(
pid=20312) TypeError: Input 'y' of 'Mul' Op has type float32 that does not match type float64 of argument 'x'.
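
If the mismatch comes from BayesOptSearch suggesting numpy.float64 values while the rest of the TF graph is float32 (an assumption on my part, not confirmed), one possible workaround is to wrap PPO in a function trainable and cast the suggested values to plain Python floats before the trainer is built. A minimal sketch (train_ppo is a made-up name):

import ray
from ray import tune
from ray.tune.suggest.bayesopt import BayesOptSearch
from ray.rllib.agents.ppo import PPOTrainer

def train_ppo(config):
    # Cast the searcher's suggestions (assumed to be numpy.float64) to plain
    # Python floats so TensorFlow infers float32 when building the loss graph.
    for key in ("gamma", "entropy_coeff"):
        config[key] = float(config[key])
    trainer = PPOTrainer(config=config)
    while True:
        result = trainer.train()
        tune.report(
            episode_reward_mean=result["episode_reward_mean"],
            timesteps_total=result["timesteps_total"],
        )

ray.init(num_cpus=2)
alg = BayesOptSearch(metric="episode_reward_mean", mode="max")
analysis = tune.run(
    train_ppo,  # function trainable instead of the registered "PPO" string
    stop={"timesteps_total": 4000},
    metric="episode_reward_mean",
    mode="max",
    config={
        "env": "CartPole-v0",
        "num_workers": 1,
        "num_gpus": 0,
        "lr": 1e-4,
        "gamma": tune.uniform(0.95, 0.97),
        "entropy_coeff": tune.uniform(0.01, 0.1),
    },
    search_alg=alg,
    num_samples=1,
)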

Case 3: I passed the “num_sgd_iter” parameter using a tune.randint() search space.

import ray
from ray import tune
from ray.tune.suggest.bayesopt import BayesOptSearch
from ray.tune.suggest import ConcurrencyLimiter
import numpy as np

ray.init(num_cpus=2)
target = 4000
alg = BayesOptSearch(metric="episode_reward_mean", mode="max")
analysis = tune.run(
    "PPO",
    local_dir="./tb_logs/",
    stop={"timesteps_total": target},
    metric="episode_reward_mean",
    mode="max",
    config={
        "env": 'CartPole-v0',
        "use_gae": True,
        "num_workers": 1,  # run in a single process
        "num_gpus": 0,
        "lr": 1e-4,
        "gamma":  tune.uniform(0.95, 0.97),
        "entropy_coeff": 0.01,
        "num_sgd_iter": tune.randint(5,10),
    },
    search_alg=alg,
    num_samples=1,

 )
print("best hyperparameters: ", analysis.best_config)

This raises the following error:

ValueError: BayesOpt does not support parameters of type Integer

Can anyone help me understand what these errors mean and how to solve them?
Which type of search space should be used when passing parameter values to BayesOptSearch()?

I am working with Python 3.9 and the following libraries:
tensorflow==2.7.0
keras==2.7.0
tensorboard==2.7.0
gym==0.21.0
ray[rllib]==1.9.2
dm-tree==0.1.6
ray[tune]
ray[default]
bayesian-optimization

How severely does this issue affect your experience of using Ray?

  • High: It blocks me from completing my task.

It looks like BayesOpt only supports search over continuous spaces, but you’re searching over a discrete space for num_sgd_iter. You’ll either need to skip searching over that parameter when using BayesOpt, or use one of the other search algorithms: Tune Search Algorithms (tune.search) — Ray 2.8.0
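
For example, a searcher that does handle integer spaces could be swapped in. A minimal sketch using HyperOptSearch (assuming hyperopt is installed, e.g. pip install hyperopt, and using the ray.tune.suggest import path that matches ray 1.9.2):

import ray
from ray import tune
from ray.tune.suggest.hyperopt import HyperOptSearch

ray.init(num_cpus=2)
alg = HyperOptSearch(metric="episode_reward_mean", mode="max")
analysis = tune.run(
    "PPO",
    stop={"timesteps_total": 4000},
    metric="episode_reward_mean",
    mode="max",
    config={
        "env": "CartPole-v0",
        "num_workers": 1,
        "num_gpus": 0,
        "lr": 1e-4,
        "gamma": tune.uniform(0.95, 0.97),
        "num_sgd_iter": tune.randint(5, 10),  # integer spaces are supported by HyperOpt
    },
    search_alg=alg,
    num_samples=4,
)
print("best hyperparameters: ", analysis.best_config)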

Thank you for the reply, I understand. Setting num_sgd_iter aside, it is still not working for “gamma” and “entropy_coeff”, even though I gave them tune.uniform(), which creates a continuous space.