How severely does this issue affect your experience of using Ray?
Low: It annoys or frustrates me for a moment.
I have many observations (~1M) and want to get actions from the policy by calling
trainer.get_policy().compute_actions
I wrote something like
for i in range(1000000):
    trainer.get_policy().compute_actions(diff_obs)
It takes several minutes to complete, which is too slow. Is there a better way to accelerate the process, e.g., with more evaluation workers?
To increase the number of evaluation workers, modify your Trainer config:
config["evaluation_num_workers"] = 8
# Also change this to increase how long the evaluation runs.
config["evaluation_duration"] = 1000000
config["evaluation_duration_unit"] = "timesteps"
If you then run
trainer.evaluate()
the evaluation runs in parallel across the evaluation workers. If you want to store the actions, however, this is not possible during evaluation; in that case it can help to increase the number of rollout workers via config["num_workers"].
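Independent of the worker count, note that compute_actions accepts a whole batch of observations, so you can also cut the per-call Python overhead by chunking the observations instead of looping one at a time. A sketch, assuming all_obs (a hypothetical name) holds your ~1M observations as an array and trainer is already built:

import numpy as np

policy = trainer.get_policy()
all_obs = np.asarray(all_obs)

batch_size = 4096  # assumed chunk size; tune to your memory budget
actions = []
for start in range(0, len(all_obs), batch_size):
    # compute_actions takes an obs batch and returns (actions, rnn_states, extra_info)
    batch_actions, _, _ = policy.compute_actions(all_obs[start:start + batch_size])
    actions.append(batch_actions)
actions = np.concatenate(actions)  # stored actions, aligned with all_obs

One call per 4096 observations replaces 4096 single-observation calls, and that per-call overhead usually dominates the runtime of the original loop.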