Tuning fcnet_hiddens with RLlib PPO ValueError: loaded state dict

Dejan_Grubisic · August 9, 2022, 4:32pm

I am trying to tune fcnet_hiddens for a simple default PPO network with ray.tune, but restoring the checkpoint fails. Here is what I did:

hiddens_layers = [3, 10, 20]
hiddens_width = [50, 100, 500]
config['model']['fcnet_hiddens'] = tune.choice([ [w] * l for w in hiddens_width for l in hiddens_layers ])
...
analysis = tune.run(PPOTrainer, config=config, num_samples=5)
checkpoint_path = analysis.get_best_checkpoint(
    metric="episode_reward_mean",
    mode="max",
    trial=analysis.trials[0]
)

best_config = analysis.get_best_config()
best_config['explore'] = False
agent = PPOTrainer(
    env="my_env",
    config=best_config
)

agent.restore(checkpoint_path) # <<<<<<<< This fails with error

Error:
ValueError: loaded state dict contains a parameter group that doesn't match the size of optimizer's group

Any idea how to tune fcnet_hiddens?

Dejan_Grubisic · August 10, 2022, 7:04pm

analysis.get_best_checkpoint returns the best checkpoint of the trial passed to it, so that trial has to be the best trial, the same one that analysis.get_best_config takes its config from. Here is the working code:

hiddens_layers = [3, 10, 20]
hiddens_width = [50, 100, 500]
config['model']['fcnet_hiddens'] = tune.choice([ [w] * l for w in hiddens_width for l in hiddens_layers ])
...
analysis = tune.run(PPOTrainer, config=config, num_samples=5)
checkpoint_path = analysis.get_best_checkpoint(
    metric="episode_reward_mean",
    mode="max",
    trial=analysis.best_trial  # instead of analysis.trials[0]
)

best_config = analysis.get_best_config()
best_config['explore'] = False
agent = PPOTrainer(
    env="my_env",
    config=best_config
)

agent.restore(checkpoint_path)
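
The fix above comes down to taking the config and the checkpoint from the same trial. Here is a minimal sketch of that idea in plain Python, without Ray. TrialRecord and pick_best are hypothetical stand-ins for illustration, not part of the Ray Tune API, and the paths and rewards are made up.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class TrialRecord:
    """Hypothetical stand-in for one tuning trial (not a Ray class)."""
    config: Dict
    checkpoint_path: str
    episode_reward_mean: float


def pick_best(trials: List[TrialRecord]) -> TrialRecord:
    """Return the trial with the highest episode_reward_mean."""
    return max(trials, key=lambda t: t.episode_reward_mean)


# Made-up trials with different network depths.
trials = [
    TrialRecord({"model": {"fcnet_hiddens": [50] * 3}}, "/tmp/trial_0/checkpoint", 12.0),
    TrialRecord({"model": {"fcnet_hiddens": [100] * 10}}, "/tmp/trial_1/checkpoint", 31.5),
]

best = pick_best(trials)
# Both values come from the same record, so the network that gets built
# matches the one that was saved.
best_config = dict(best.config, explore=False)
checkpoint_path = best.checkpoint_path

# The pairing from the first post: checkpoint of trials[0], config of the best trial.
mismatched = trials[0].config["model"] != best.config["model"]
print(checkpoint_path, mismatched)
```

With real Ray objects, the same pairing is what passing analysis.best_trial to get_best_checkpoint achieves in the post above.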

starkj · October 20, 2022, 2:58am

@Dejan_Grubisic I am seeing the same error when restoring a checkpoint for a 3-layer network. However, if my network has only 2 layers, the restore works fine. See my post "ValueError when restoring checkpoint with PPO". I suspect you are seeing the same issue I was. I’m a little sleepy, but is it possible that your restore using checkpoint_path is loading a network with a different structure than the one defined in best_config?
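
One way to see why a structural mismatch could produce this exact error: as far as I can tell, PyTorch's Optimizer.load_state_dict raises it when a saved parameter group holds a different number of tensors than the optimizer's current group. The sketch below mimics that comparison in plain Python. count_param_tensors and check_restore are hypothetical helpers, and the count assumes one weight and one bias tensor per fully connected layer plus an output layer, which simplifies what RLlib actually builds.

```python
from typing import List


def count_param_tensors(fcnet_hiddens: List[int]) -> int:
    """Tensors in a plain MLP: weight + bias per hidden layer, plus an output layer.

    Simplifying assumption; the real RLlib model also has a value branch.
    """
    return 2 * (len(fcnet_hiddens) + 1)


def check_restore(saved_hiddens: List[int], current_hiddens: List[int]) -> None:
    """Mimic the size comparison behind the ValueError in this thread."""
    saved = count_param_tensors(saved_hiddens)
    current = count_param_tensors(current_hiddens)
    if saved != current:
        raise ValueError(
            f"saved group has {saved} tensors, optimizer group has {current}"
        )


check_restore([100] * 3, [500] * 3)  # same depth: tensor counts agree

try:
    check_restore([50] * 3, [50] * 10)  # different depth: counts differ
    restore_failed = False
except ValueError as err:
    restore_failed = True
    print(err)
```

Under this assumption, two networks of equal depth but different widths pass the count check, so a matching count does not prove the architectures match. A width mismatch would presumably fail at a different point, when the weights themselves are loaded.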
