Hi everyone.
I have trained a PPO agent using:
tune.run(
    run_or_experiment="PPO",
    config={
        "env": "Battery",
        "num_gpus": 1,
        "num_workers": 13,
        "num_cpus_per_worker": 1,
        "train_batch_size": 1024,
        "num_sgd_iter": 20,
        "explore": True,
        "exploration_config": {"type": "StochasticSampling"},
    },
    stop={"episode_reward_mean": 0.15},
    checkpoint_freq=200,
    local_dir="second_checkpoints",
)
I want to be able to load an agent from a checkpoint and display the actions associated with each step taken.
Previously I used PPOTrainer directly; however, I switched to tune in order to distribute the training.
How can I extract the agent from the ray.tune checkpoint to run a test and see what the actions taken look like?
while not done:
    action, state, logits = agent.compute_action(obs, state)
    obs, reward, done, info = env.step(action)
    episode_reward += reward
Hey @Carterbouley , great question!
You can do this after tune completed training:
trained_trainer = PPOTrainer(config=config)
trained_trainer.restore("[your checkpoint]")

# Then your while loop:
while not done:
    action, state, logits = trained_trainer.compute_action(obs, state)
    obs, reward, done, info = env.step(action)
    episode_reward += reward
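To inspect the actions step by step, it can also help to wrap that loop in a small helper that records the whole trajectory. This is a sketch assuming a Gym-style env (`reset()`/`step()`) and an agent with a stateless `compute_action(obs)`; for a recurrent policy you would thread `state` through as in the snippet above:

```python
def run_episode(agent, env, max_steps=1000):
    """Roll out one episode, recording (obs, action, reward) per step.

    Works with any agent exposing compute_action(obs) and any
    Gym-style env with reset()/step(). Returns the trajectory and
    the total reward so the actions can be inspected afterwards.
    """
    obs = env.reset()
    trajectory, episode_reward, done = [], 0.0, False
    for _ in range(max_steps):
        if done:
            break
        action = agent.compute_action(obs)
        next_obs, reward, done, info = env.step(action)
        trajectory.append((obs, action, reward))
        episode_reward += reward
        obs = next_obs
    return trajectory, episode_reward
```

You can then print or plot `[a for (_, a, _) in trajectory]` to see what the policy did at every step.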
Hi @sven1977 Sven,
Thanks so much for getting back to me. I was unsure whether my tune config could be passed directly as the PPOTrainer config. I now get an error stating that the PPO object has no attribute 'optimizer'.
Is there a way to fix this that you know of?
Thanks in advance for your help!
The stack trace is here if it helps!
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-20-c721b7d26a42> in <module>
----> 1 trained_trainer.restore('checkpoint-12000')
~/Documents/rl_agent/ray-venv/lib/python3.8/site-packages/ray/tune/trainable.py in restore(self, checkpoint_path)
364 self.load_checkpoint(checkpoint_dict)
365 else:
--> 366 self.load_checkpoint(checkpoint_path)
367 self._time_since_restore = 0.0
368 self._timesteps_since_restore = 0
~/Documents/rl_agent/ray-venv/lib/python3.8/site-packages/ray/rllib/agents/trainer.py in load_checkpoint(self, checkpoint_path)
695 def load_checkpoint(self, checkpoint_path: str):
696 extra_data = pickle.load(open(checkpoint_path, "rb"))
--> 697 self.__setstate__(extra_data)
698
699 @DeveloperAPI
~/Documents/rl_agent/ray-venv/lib/python3.8/site-packages/ray/rllib/agents/trainer_template.py in __setstate__(self, state)
175 @override(Trainer)
176 def __setstate__(self, state):
--> 177 Trainer.__setstate__(self, state)
178 self.train_exec_impl.shared_metrics.get().restore(
179 state["train_exec_impl"])
~/Documents/rl_agent/ray-venv/lib/python3.8/site-packages/ray/rllib/agents/trainer.py in __setstate__(self, state)
1238 r.restore.remote(remote_state)
1239 if "optimizer" in state:
-> 1240 self.optimizer.restore(state["optimizer"])
1241
1242 @staticmethod
AttributeError: 'PPO' object has no attribute 'optimizer'
I have searched around but don't see anyone else who has posted this problem!
Thanks in advance
Hmm, strange. Could you send me a short reproduction script that shows this behavior?
Then I can debug. Never seen this in our train/save/restore tests.
That's a very generous offer, Sven, thank you. I have sent you an email with further information!
Hi Sven,
I sent you an email and a message through this platform, could you confirm if you have received either?
Thanks in advance!
Hi @sven1977, just wondering if I have sent you enough details via email to be able to recreate the problem?
Thanks,
Carter
Or if any of the other Ray team is available to help, that would be great too! Thanks!
Hi guys,
I believe I fixed this. I had trained with Ray 0.8.3 but was trying to restore with 1.2. Restoring under 0.8.4 fixed this.
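For anyone hitting the same error: RLlib checkpoints pickle internal Trainer state, so a checkpoint written by one Ray version is not guaranteed to restore under a different one. A minimal sketch of a sanity check (a heuristic only, comparing major.minor version components, with `check_restore_version` being a hypothetical helper name):

```python
def version_tuple(v):
    """Parse a dotted version string like '0.8.4' into a comparable tuple."""
    return tuple(int(part) for part in v.split(".")[:3])

def check_restore_version(trained_with, current):
    """Return True if the checkpoint was written by the same major.minor
    Ray version as the one doing the restore.

    A heuristic only: pickled Trainer internals can change between
    releases, so mismatches are a likely cause of restore() failures.
    """
    return version_tuple(trained_with)[:2] == version_tuple(current)[:2]
```

For example, comparing the training version against `ray.__version__` before calling `restore()` would have flagged the 0.8.3-vs-1.2 mismatch here.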
@Carterbouley , this is great! Sorry, I hadn't had time to look at this yet. Thanks for sharing a solution.