I have a fairly fundamental question: how do I load a custom RNN model that I trained with Tune on a custom environment, so that I can make predictions with it? I have Tune configured to create checkpoints during training.
Suppose I have something like:
tune.run(
    "PPO",
    config=config,
    checkpoint_freq=1,
    checkpoint_at_end=True,
    ...
)
The model and the environment are configured in config following the tutorials and examples on GitHub.
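For concreteness, the setup looks roughly like this (MyRNNModel, MyEnv, and the cell_size value are placeholders for my actual classes and settings):

from ray.rllib.models import ModelCatalog
from ray.tune.registry import register_env

# Register the custom model and environment under string keys so the
# config can reference them by name (MyRNNModel and MyEnv stand in for
# my actual classes).
ModelCatalog.register_custom_model("my_rnn", MyRNNModel)
register_env("my_env", lambda env_config: MyEnv(env_config))

config = {
    "env": "my_env",
    "model": {
        "custom_model": "my_rnn",
        "custom_model_config": {"cell_size": 256},
    },
}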
The example in custom_rnn_model.py gave me an idea of how to do manual inference with the compute_action function, but it is not clear to me how to load a model from a checkpoint and then run manual inference with it. Are there any examples of this, or can someone tell me how to do it?
Extract from the comments in custom_rnn_model.py:
# To run the Trainer without tune.run, using our RNN model and
# manual state-in handling, do the following (use `config` from the
# code above):
import numpy as np
from ray.rllib.agents.ppo import PPOTrainer
from ray.rllib.examples.env.repeat_after_me_env import RepeatAfterMeEnv

trainer = PPOTrainer(config)
lstm_cell_size = config["model"]["custom_model_config"]["cell_size"]
env = RepeatAfterMeEnv({})
obs = env.reset()

# range(2) b/c h- and c-states of the LSTM.
init_state = state = [
    np.zeros([lstm_cell_size], np.float32) for _ in range(2)
]

while True:
    # Pass the recurrent state in explicitly; with a state given,
    # compute_action returns the action, the state-out, and extra fetches.
    a, state_out, _ = trainer.compute_action(obs, state)
    obs, reward, done, _ = env.step(a)
    if done:
        obs = env.reset()
        state = init_state
    else:
        state = state_out
The file shows that I need to create an initial state for the RNN and then use the usual loop to step through the environment until it returns done. My question is: how can I initialize the model (or the PPOTrainer) with one of my checkpoints?
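What I would expect to work, based on the Trainable API, is something like the sketch below; the checkpoint path is a placeholder for one of the files tune.run wrote under ~/ray_results, and I am not sure this is the intended way:

from ray.rllib.agents.ppo import PPOTrainer

# Build a fresh trainer with the same config that was used for training ...
trainer = PPOTrainer(config=config)

# ... and load the trained weights from a Tune checkpoint. The path is a
# placeholder; the actual file lives in the trial's logdir, e.g.
# ~/ray_results/PPO/<trial_name>/checkpoint_000010/checkpoint-10.
trainer.restore("/path/to/checkpoint_000010/checkpoint-10")

# After restoring, the inference loop from the extract above should work
# unchanged, since compute_action now uses the restored weights.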