I create a DQNTrainer and load a checkpoint from a DQN agent that I trained:
import pickle
from ray.rllib.agents.dqn import DQNTrainer

# config that was saved alongside the checkpoint
with open("params.pkl", "rb") as f:
    config = pickle.load(f)

config["evaluation_num_workers"] = 0  # I also tried with None, same result
trainer = DQNTrainer(config=config)
trainer.restore(checkpoint)  # path to the trained DQN checkpoint
trainer.config["evaluation_num_episodes"] = 1
metrics = trainer.evaluate()
print(metrics)
Result:
{'evaluation': {'episode_reward_max': nan, 'episode_reward_min': nan, 'episode_reward_mean': nan, 'episode_len_mean': nan, 'episode_media': {}, 'episodes_this_iter': 0, 'policy_reward_min': {}, 'policy_reward_max': {}, 'policy_reward_mean': {}, 'custom_metrics': {}, 'hist_stats': {'episode_reward': [], 'episode_lengths': []}, 'sampler_perf': {}, 'off_policy_estimator': {}}}
If instead I set
config['evaluation_num_workers'] = 1
things work. But if I do that, the same code no longer works for an R2D2 checkpoint (see my other question).
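For reference, this is the kind of manual rollout I would have to fall back to if evaluation workers are strictly required. This is just a sketch: it assumes the "env" key in my config is a plain Gym id (a custom env would need its registered creator), and compute_action is called compute_single_action in newer RLlib versions.

import gym

# Manual one-episode evaluation on the local worker only.
env = gym.make(config["env"])
obs = env.reset()
done, total_reward = False, 0.0
while not done:
    action = trainer.compute_action(obs, explore=False)
    obs, reward, done, _ = env.step(action)
    total_reward += reward
print("episode_reward:", total_reward)

I would rather use trainer.evaluate() directly, though.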
What could be the issue?