I would like to perform a custom evaluation (without any training) following the script given below. For instance, this could be the case when re-evaluating a policy with different parameters.
Unfortunately, an environment responsible for sample collection is always created, either on the local worker or on a remote worker, even though it is never used… The problem is that my env creation is costly, and I only need an env for the evaluation part.
Would it be possible to disable this unnecessary env creation when performing an evaluation? I have dug into ray/rllib/evaluation/rollout_worker.py, but there seems to be no way to avoid it.
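The only workaround I can think of so far is to register a cheap placeholder env with the same observation and action spaces, so that the unused sampling worker pays almost nothing at construction time. This is just a sketch, not a real solution: DummyEnv and its spaces are made up for illustration and would have to match the real env exactly.

import gym
from gym.spaces import Box, Discrete
from ray.tune.registry import register_env

class DummyEnv(gym.Env):
    # Cheap stand-in for the costly env; only the spaces matter here.
    def __init__(self, env_config=None):
        self.observation_space = Box(-1.0, 1.0, shape=(4,))  # must match the real env
        self.action_space = Discrete(2)                      # must match the real env

    def reset(self):
        return self.observation_space.sample()

    def step(self, action):
        # End the episode immediately; this env is never really stepped.
        return self.observation_space.sample(), 0.0, True, {}

register_env("dummy_env", lambda env_config: DummyEnv(env_config))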
During training, I have one env for sampling data and one for evaluating the policy. Once training is done, I would like to restore a checkpoint and run an evaluation with it; I only need one env for that. Nevertheless, I am not able to do so: an env for sampling is always created, even if it is never used. I am just trying to do something like this:
agent = eval_config["trainer"](config=agent_config)
for it, checkpoint_path in enumerate(eval_config["checkpoint_path"]):
    if checkpoint_path is not None:
        agent.restore(checkpoint_path)
    results = agent.evaluate()
Whatever the value of num_workers (0 or 1), I notice that the env is initialised twice: one env for the remote rollout worker that performs the evaluation, which is fine, and one that sits idle, as explained above.
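For completeness, this is the kind of config I would like to end up with. evaluation_num_workers and evaluation_config are standard RLlib keys; "dummy_env" refers to the hypothetical placeholder registered above, and "my_real_env" stands for my actual, costly env. Whether "env" can even be overridden inside evaluation_config is exactly the part I am unsure about.

agent_config = {
    "env": "dummy_env",            # placeholder for the (idle) sampling worker
    "num_workers": 0,              # no remote sampling workers
    "evaluation_num_workers": 1,   # one worker dedicated to evaluation
    "evaluation_config": {
        "env": "my_real_env",      # the costly env, used only for evaluation
    },
}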