Evaluation during training without local env_runner

How severely does this issue affect your experience of using Ray?

  • Medium: It contributes to significant difficulty in completing my task, but I can work around it.

I have an unusual constraint where multiple instances of my training environment can't exist in the same process. Fortunately, during training I can simply make sure that each env_runner only runs one environment, so this is never a problem; a rough sketch of that setup is below.
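For context, here's roughly what our training-only config looks like (PPO and the env id are just placeholders for our actual algorithm and environment):

```python
from ray.rllib.algorithms.ppo import PPOConfig

# Rough sketch of our training setup (PPO and the env id are placeholders).
# The key point: every env_runner is its own remote worker process and
# hosts exactly one copy of the environment.
config = (
    PPOConfig()
    .environment("MyCustomEnv-v0")       # placeholder for our custom env
    .env_runners(
        num_env_runners=4,               # remote workers, one process each
        num_envs_per_env_runner=1,       # never more than one env per process
    )
)
```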

Recently, however, we've been trying to add evaluation to our training. By default the Algorithm creates two env_runner_groups (one for training, one for evaluation), and both of them create a local env_runner. Those two local env_runners then instantiate our env in the same process, and everything crashes.
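The evaluation settings we're adding look roughly like this (parameter names as I understand the current AlgorithmConfig API); as described above, enabling them gives us a second env_runner_group whose local env_runner builds the env in the same process as the training group's local env_runner:

```python
# Evaluation settings we're adding (names as I understand the
# AlgorithmConfig API). Turning this on makes the Algorithm build a second
# EnvRunnerGroup for evaluation; both groups' local env_runners then
# instantiate the env in the driver process, which is where we crash.
config = config.evaluation(
    evaluation_interval=1,               # evaluate every training iteration
    evaluation_num_env_runners=1,        # remote evaluation workers
    evaluation_duration=10,
    evaluation_duration_unit="episodes",
)
```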

I was hoping to get some advice on how to solve this. I think I could create a custom Algorithm class that avoids creating the local env_runners, but maybe there's a config option I'm missing? A sketch of the kind of setting I have in mind is below.
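For concreteness, the kind of knob I'm hoping exists would look something like this; `create_env_on_local_worker` is the closest setting I've found in `AlgorithmConfig.env_runners()`, but I'm not sure whether it also applies to the evaluation group's local env_runner:

```python
# The kind of switch I'm hoping exists. create_env_on_local_worker is the
# closest thing I've found in AlgorithmConfig.env_runners(), but I don't
# know whether it also keeps the evaluation group's local env_runner from
# building an env in the driver process.
config = config.env_runners(
    create_env_on_local_worker=False,    # don't build an env in the driver?
)
```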