How severely does this issue affect your experience of using Ray?
- High: It blocks me from completing my task.
I am simply trying to evaluate a model from a checkpoint for 10 episodes. I have been unable to use Tune for evaluation, as controlling the number of episodes it runs seems impossibly complicated, and I have no idea why.
Instead, I have moved to using trainer.evaluate() after restoring my checkpoint with trainer.restore(). This does what I need, except that I lose the auto-generated result information from Tune, and it always runs one more episode than the number defined in the evaluation_duration config parameter.
Why is this the case? And how can I fix it? These are the evaluation-related configuration options I have set:
# Evaluation settings
policy_conf['evaluation_interval'] = 0
policy_conf['evaluation_duration'] = 10 # change to 1 episode?
policy_conf['evaluation_duration_unit'] = 'episodes'
policy_conf['evaluation_parallel_to_training'] = False
policy_conf['in_evaluation'] = False
policy_conf['evaluation_config'] = {}
policy_conf['evaluation_num_workers'] = 1
policy_conf['custom_eval_function'] = None
policy_conf['always_attach_evaluation_results'] = True
policy_conf['sample_async'] = False
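For context, here is a minimal sketch of what I am doing. The algorithm class and checkpoint path are placeholders; the evaluation settings match those above:

```python
# Minimal sketch (placeholder trainer class and checkpoint path).
# Only the evaluation-related settings are shown here.
eval_conf = {
    "evaluation_interval": 0,
    "evaluation_duration": 10,        # expected: exactly 10 episodes
    "evaluation_duration_unit": "episodes",
    "evaluation_parallel_to_training": False,
    "evaluation_num_workers": 1,
}

# from ray.rllib.agents.ppo import PPOTrainer   # PPO used as an example
# trainer = PPOTrainer(config={**base_conf, **eval_conf})
# trainer.restore("/path/to/checkpoint")        # placeholder path
# results = trainer.evaluate()                  # runs 11 episodes, not 10
```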
Let me know if any other information is required. I would appreciate any help on the matter. Thank you.