How to define fcnet_hiddens size and number of layers in RLlib Tune?

I am wondering how to define the number of layers and the number of neurons per layer in RLlib with Ray Tune. I would like to do it with two parameters:

  • number of layers: 1, 2, or 3
  • neurons per layer: 8 to 256

Each layer has the same number of neurons, for simplicity. This is what I have so far:

        "model": {
            "fcnet_hiddens": tune.choice([[32, 32,32], [64, 64], [128, 128]]),
        },

Hi @Peter_Pirog,

Here is one way to do it. You could also just use np.rand… in that lambda function instead if you wanted.

"fcnet_hiddens": tune.sample_from(lambda: [tune.choice(range(8,257)).sample()]*tune.choice(range(1,4)).sample())

Thank you very much for the answer.

@mannyv when I try to use Optuna, it gives this error:

ValueError: Optuna search does not support parameters of type Function with samplers of type NoneType

Hi @Peter_Pirog,

I thought you were running with ray.tune in a grid-search-type setup. With Optuna, I would expect you to specify a search config sort of like this:

search_space = {
    "num_layers": tune.randint(1, 4),
    "num_units": tune.randint(8, 257),
    #...
}

And then in your “objective” function you would have something like:

def objective(config):
     #...
     rllib_config["model"]["fcnet_hiddens"] = [config["num_units"]]*config["num_layers"] 
     #...
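
Filling in the elided parts, a fuller sketch of that pattern could look like this (assuming a function trainable that drives PPOTrainer by hand; the env and iteration count are placeholders, not anything prescribed):

from ray import tune
from ray.rllib.agents.ppo import PPOTrainer
from ray.tune.suggest.optuna import OptunaSearch

def objective(config):
    # Fold the two sampled integers into RLlib's model config.
    rllib_config = {
        "env": "LunarLander-v2",  # placeholder env
        "lr": config["lr"],
        "model": {"fcnet_hiddens": [config["num_units"]] * config["num_layers"]},
    }
    trainer = PPOTrainer(config=rllib_config)
    for _ in range(10):  # placeholder: a fixed number of training iterations
        result = trainer.train()
        # Report the metric Tune should optimize.
        tune.report(episode_reward_mean=result["episode_reward_mean"])
    trainer.stop()

search_space = {
    "num_layers": tune.randint(1, 4),
    "num_units": tune.randint(8, 257),
    "lr": tune.uniform(5e-6, 3e-3),
}

analysis = tune.run(
    objective,  # the function trainable is the first positional argument
    config=search_space,
    search_alg=OptunaSearch(),
    metric="episode_reward_mean",
    mode="max",
)

Note that in this style objective itself is the trainable passed to tune.run; there is no separate run_or_experiment="PPO".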

@mannyv @Peter_Pirog I have a silly question: is the code below correct? What is the parameter name for objective in tune.run()? I can't figure it out from the docs page Execution (tune.run, tune.Experiment) — Ray 1.13.0.

def objective(config):
    config["model"]["fcnet_hiddens"] = [config["num_units"]]*config["num_layers"]
    return config

search_space = {
    "env": MyEnv,
    "num_layers": tune.ray.tune.randint(1,4),
    "num_units": ray.tune.randint(8,257),
    "lr": tune.uniform(5e-6, 3e-3),
}

analysis = tune.run(
    objective, # what's param name?
    config=search_space,
    run_or_experiment="PPO",
    metric='episode_reward_mean',
    mode='max',
    stop={"episodes_total": 100},
)

@mannyv @sirjay, like sirjay, I don't understand how objective is connected to the input arguments of tune.run. The first argument is run_or_experiment, but that value is already defined here:

run_or_experiment="PPO"

@mannyv @sirjay
I tried to run the code below:

from ray import tune
from ray.tune.suggest.optuna import OptunaSearch

def objective(config):
    config["model"]["fcnet_hiddens"] = [config["num_units"]]*config["num_layers"]
    return config

search_space = {
    #"env": MyEnv,
    "env": "LunarLander-v2",
    "num_layers": tune.randint(1,4), #tune.ray.tune.randint(1,4),
    "num_units": tune.randint(8,257), #ray.tune.randint(8,257),
    "lr": tune.uniform(5e-6, 3e-3),
}

analysis = tune.run(
    objective, # what's param name?
    config=search_space,
    search_alg=OptunaSearch(),
    run_or_experiment="PPO",
    metric='episode_reward_mean',
    mode='max',
    stop={"episodes_total": 100},
)

but the error is:

Traceback (most recent call last):
  File "/home/ppirog/projects/Mastering-Reinforcement-Learning-with-Python/custom_scripts/optuna_tune.py", line 15, in <module>
    analysis = tune.run(
TypeError: run() got multiple values for argument 'run_or_experiment'

Process finished with exit code 1
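
The cause of that TypeError is the signature of tune.run: its first positional parameter is run_or_experiment, so passing objective positionally and run_or_experiment="PPO" as a keyword supplies the same parameter twice. Only one trainable goes in:

# Either run your own function trainable:
analysis = tune.run(objective, config=search_space)

# ...or run a registered algorithm by name (same positional slot):
analysis = tune.run("PPO", config=search_space)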

@Yard1, I found your post here:
https://github.com/ray-project/ray/issues/19698
Could you explain how to use the define-by-run interface to define conditional hyperparameters? The link to the example doesn't work.

Hey, you can find the example here - ray/optuna_define_by_run_example.py at master · ray-project/ray · GitHub

Let me know if you need any help with it. It should be straightforward!
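
In rough outline, you pass a function as the space argument to OptunaSearch; here is a minimal sketch adapted to this thread (the env name and ranges come from the posts above; check the exact signature against the linked example):

from ray.tune.suggest.optuna import OptunaSearch

def define_by_run_space(trial):
    # Suggestions are made imperatively, so later ones can depend on
    # earlier ones (conditional hyperparameters).
    trial.suggest_int("num_layers", 1, 3)
    trial.suggest_int("num_units", 8, 256)
    trial.suggest_float("lr", 5e-6, 3e-3)
    # Constants can be returned as a dict and are merged into the config.
    return {"env": "LunarLander-v2"}

search_alg = OptunaSearch(
    space=define_by_run_space,
    metric="episode_reward_mean",
    mode="max",
)

num_units and num_layers still need to be folded into fcnet_hiddens inside the trainable, as discussed above.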

@Yard1, thank you very much for the quick answer :grinning:. Now I will try to analyze it and connect this example with RLlib tuning.


@Yard1 Is there any example with RLlib? I'm not sure how to plug an algorithm in instead of easy_objective. Typically I use "PPO" as the algorithm name, but here I have a problem with the objective function.

@Yard1 I am interested too!

@sirjay I am still trying to combine the RLlib algorithm with an objective function. I hope the examples below can help me somehow:
https://github.com/ray-project/ray/blob/master/rllib/examples/custom_experiment.py
https://docs.ray.io/en/latest/tune/api_docs/trainable.html#training-tune-trainable-tune-report

Hi @Peter_Pirog,

I had a few free minutes this morning. Here is an example script for how you could get this running.

There is a Colab you can experiment with here: Google Colab

If you found this useful, check out this link: Manny is helping people do cool things with RL

import ray
from ray import tune
from ray.rllib.agents.ppo import PPOTrainer, PPOConfig
from ray.rllib.utils.typing import TrainerConfigDict
from ray.tune.suggest.optuna import OptunaSearch

# You can ignore this. Colab does not have enough CPUs to run with the default settings; this is a workaround.
ray.shutdown()
ray.init(num_cpus=10) 


class MyPPO(PPOTrainer):
    @classmethod
    def get_default_config(cls) -> TrainerConfigDict:
        # Extend the default config with the two search keys so that
        # RLlib's config validation accepts them.
        ppo_config = PPOConfig().to_dict()
        ppo_config["num_units"] = None
        ppo_config["num_layers"] = None
        return ppo_config

    def validate_config(self, config: TrainerConfigDict) -> None:
        # Fold the sampled values into the model config before training.
        if config["num_layers"] is not None and config["num_units"] is not None:
            self.num_layers = config["num_layers"]
            self.num_units = config["num_units"]
            config["model"]["fcnet_hiddens"] = [self.num_units] * self.num_layers
        super().validate_config(config)


tune.register_trainable("MyPPO", MyPPO)

search_space = {
    "env": "LunarLander-v2",
    "num_layers": tune.randint(1,4),
    "num_units": tune.randint(8,257),
    "lr": tune.uniform(5e-6, 3e-3),
}

analysis = tune.run(
    config=search_space,
    search_alg=OptunaSearch(),
    run_or_experiment="MyPPO",
    metric='episode_reward_mean',
    mode='max',
    stop={"episodes_total": 100},
    num_samples=5,
)

print("Best hyperparameters found were: ", analysis.best_config)

ray.shutdown()
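
The idea here is that subclassing PPOTrainer lets the two extra keys (num_layers, num_units) pass RLlib's config validation, and validate_config folds them into model/fcnet_hiddens before training starts, so the trainer can be run by name with OptunaSearch and no hand-written objective function.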

Nice work, coffee for you :coffee:


How about

num_layers = [3, 10, 20]
num_nodes = [50, 100, 500]
"model": {"fcnet_hiddens": tune.choice([ [N] * L for N in num_nodes for L in num_layers ])}

Excuse me, when I run this code I get the error below. Could you help me solve it?

/usr/local/lib/python3.8/dist-packages/ray/rllib/algorithms/algorithm.py in default_resource_request(cls, config)
   2041
   2042     # Convenience config handles.
-> 2043     cf = cls.get_default_config().update_from_dict(config)
   2044     cf.validate()
   2045     cf.freeze()

AttributeError: 'dict' object has no attribute 'update_from_dict'

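
This later error comes from running the example on Ray 2.x, where get_default_config() is expected to return an AlgorithmConfig object rather than a plain dict; the override above returns PPOConfig().to_dict(), and a dict has no update_from_dict method. One possible direction, as an untested sketch against the Ray 2.x imports (whether extra attributes survive validation is an assumption):

from ray.rllib.algorithms.ppo import PPO, PPOConfig

class MyPPO(PPO):
    @classmethod
    def get_default_config(cls) -> PPOConfig:
        config = PPOConfig()
        # Assumption: attributes set before the config is frozen are kept,
        # so update_from_dict()/validate()/freeze() can proceed.
        config.num_units = None
        config.num_layers = None
        return config

The validate_config override would also need a matching update for the 2.x API.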