I have a fairly fundamental question: how do I load a custom RNN model that I trained with Tune on a custom environment, so that I can make predictions with it? I have Tune configured to create checkpoints during training.
Suppose I have something like:
tune.run(
    "PPO",
    config=config,
    checkpoint_freq=1,
    checkpoint_at_end=True,
    ...
)
The model and environment are configured in config following the tutorials and examples on GitHub.
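As a side note, I keep the object that tune.run returns (an ExperimentAnalysis, as far as I understand), so I should be able to look up a checkpoint path afterwards, roughly like this (the metric name is just a guess on my side, adjust it to whatever your run actually reports):

analysis = tune.run("PPO", config=config, checkpoint_freq=1, checkpoint_at_end=True)

# Path of the best checkpoint of the best trial (metric/mode are assumptions).
best_trial = analysis.get_best_trial(metric="episode_reward_mean", mode="max")
checkpoint_path = analysis.get_best_checkpoint(
    best_trial, metric="episode_reward_mean", mode="max"
)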
The comment block in custom_rnn_model.py gave me an idea of how to do manual inference with the compute_action function; however, it is not clear to me how to load a model from a checkpoint and then do manual inference with it. Are there any examples of this, or can someone tell me how to do it?
Extract from custom_rnn_model.py
# To run the Trainer without tune.run, using our RNN model and
# manual state-in handling, do the following:
# Example (use `config` from the above code):
# >> import numpy as np
# >> from ray.rllib.agents.ppo import PPOTrainer
# >>
# >> trainer = PPOTrainer(config)
# >> lstm_cell_size = config["model"]["custom_model_config"]["cell_size"]
# >> env = RepeatAfterMeEnv({})
# >> obs = env.reset()
# >>
# >> # range(2) b/c h- and c-states of the LSTM.
# >> init_state = state = [
# .. np.zeros([lstm_cell_size], np.float32) for _ in range(2)
# .. ]
# >>
# >> while True:
# >> a, state_out, _ = trainer.compute_action(obs, state)
# >> obs, reward, done, _ = env.step(a)
# >> if done:
# >> obs = env.reset()
# >> state = init_state
# >> else:
# >> state = state_out
The file shows that I need to create an initial state for the RNN and then use the familiar loop to step through the environment until it returns done. My question is: how can I initialize the model (or the PPOTrainer) with one of my checkpoints?
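What I imagine might work (but have not been able to confirm) is to build the trainer from the same config and then restore it from the checkpoint before running the loop above. In the sketch below, checkpoint_path, MyCustomEnv, and env_config are placeholders for my actual checkpoint file and environment:

import numpy as np
from ray.rllib.agents.ppo import PPOTrainer

# Same config (custom RNN model + env) that was used for training with Tune.
trainer = PPOTrainer(config=config)

# checkpoint_path is a placeholder for one of the checkpoint files Tune wrote
# (by default somewhere under ~/ray_results).
trainer.restore(checkpoint_path)

lstm_cell_size = config["model"]["custom_model_config"]["cell_size"]
env = MyCustomEnv(env_config)  # placeholder for my custom environment
obs = env.reset()

# Zero-initialized h- and c-states of the LSTM, as in the extract above.
init_state = state = [np.zeros([lstm_cell_size], np.float32) for _ in range(2)]

done = False
while not done:
    action, state, _ = trainer.compute_action(obs, state=state)
    obs, reward, done, info = env.step(action)

Is restoring the PPOTrainer like this the intended way, or is there a more direct way to load just the model from a checkpoint?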