Tune custom gym env_config params with PBT

I am training an RL model and trying to tune its hyperparameters with PBT, following this example:


It all works like a charm, but I can’t figure out how to tune my custom environment hyperparameters.
The PBT scheduler's hyperparam_mutations seems to accept only the algorithm hyperparameters (PPO in my case), but not the custom environment ones. In the code below, "base_line_degrees" is never sampled from the range (0.1, 2.0); instead it always uses the default value (-1) set in the custom gym env's __init__ method:

register_env("custom_env", lambda config: custom_env(config))

env_config = {
    "base_line_degrees": 1,
}

ppo_config = (
    PPOConfig()
    .environment(env="custom_env", env_config=env_config)
)

hyperparam_mutations = {
    "lambda": lambda: random.uniform(0.9, 1.0),
    "clip_param": lambda: random.uniform(0.01, 0.5),
    "lr": [1e-3, 5e-4, 1e-4, 5e-5, 1e-5],
    "base_line_degrees": lambda: random.uniform(0.1, 2.0),
}

pbt = PopulationBasedTraining(
    hyperparam_mutations=hyperparam_mutations,
)

analysis = tune.run(
    "PPO",
    metric="episode_reward_mean",
    mode="max",
    scheduler=pbt,
    config=ppo_config.to_dict(),
)
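To show what I mean by "always uses the default value": here is a minimal stand-in for my custom env's __init__ (the real gym plumbing is omitted; CustomEnv here is just a dummy to illustrate where the -1 default comes from):

```python
# Stand-in for my custom env's __init__ (gym details omitted).
# The only point: "base_line_degrees" falls back to -1 when the
# key is missing from the env_config dict PBT hands the env.
class CustomEnv:
    def __init__(self, config=None):
        config = config or {}
        self.base_line_degrees = config.get("base_line_degrees", -1)

env = CustomEnv({})           # empty env_config -> default
print(env.base_line_degrees)  # -1
```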

I’m clearly not doing it right. It doesn’t seem to me that I’m giving PBT any way to access the env_config object, but I can’t find any example or documentation on how to do it.
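For what it's worth, the samplers themselves are fine. Here is a plain-Python sketch (no Ray needed; the sample helper is mine, not a Tune API) that draws one value per entry the way I expect PBT to perturb them, so the issue seems to be how the "base_line_degrees" key maps onto the config, not the sampling:

```python
import random

# Same mutations dict as above; each leaf is either a callable
# sampler or a list of discrete choices.
hyperparam_mutations = {
    "lambda": lambda: random.uniform(0.9, 1.0),
    "clip_param": lambda: random.uniform(0.01, 0.5),
    "lr": [1e-3, 5e-4, 1e-4, 5e-5, 1e-5],
    "base_line_degrees": lambda: random.uniform(0.1, 2.0),
}

def sample(mutations):
    """Draw one value per entry (my helper, mimicking a PBT perturbation)."""
    out = {}
    for key, spec in mutations.items():
        if callable(spec):
            out[key] = spec()           # resample from the distribution
        elif isinstance(spec, list):
            out[key] = random.choice(spec)  # pick a discrete choice
        else:
            out[key] = sample(spec)     # nested dict of mutations
    return out

print(sample(hyperparam_mutations))
```

Every run prints "base_line_degrees" values inside (0.1, 2.0), yet the env still sees -1.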

Can custom gym hyperparams be tuned with PBT, and if so, how should it be done?

Thanks for your help : )