Hardware requirements and setup for running performant APE-X

How severely does this issue affect your experience of using Ray?

  • High: It blocks me from completing my task.

Hello! I am a new RLlib user and I would like to train APE-X agents on ALE. I have a few questions that I hope you can answer:

  1. Currently, I have access to a machine with 64 CPUs (252 GB of RAM) and 1 GPU (NVIDIA RTX 3090 Ti, 24 GB VRAM). How should I pick num_rollout_workers and num_envs_per_worker given my hardware specs?
  2. If I want to get to about 40k frames per second with APE-X, what kind of compute infrastructure would I need? How many GPUs and how many CPUs? What about RAM?
  3. Conceptual question: my understanding is that APE-X creates a learner on the GPU (which computes gradients and updates the network parameters) and several actors spread across CPUs. How do these actors determine the actions they send to the env? Does each actor hold its own copy of the Q-network, or do they query the Q-network on the GPU for actions? Were the RLlib results in the arXiv paper obtained by running the actors on CPU or GPU?
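For question 1, here is the back-of-the-envelope CPU split I have been considering (my own sketch; the number of CPUs reserved for the driver/learner and for replay-buffer shards are my assumptions, not RLlib defaults):

```python
# Rough CPU budgeting for a single-machine APE-X run.
# Assumptions (mine, not RLlib defaults): 1 CPU for the driver/learner
# process and 4 CPUs reserved for replay-buffer shard actors.

def plan_rollout_workers(total_cpus: int,
                         replay_shards: int = 4,
                         driver_cpus: int = 1) -> int:
    """Return a candidate num_rollout_workers for a given CPU budget."""
    available = total_cpus - driver_cpus - replay_shards
    if available <= 0:
        raise ValueError("not enough CPUs for any rollout workers")
    return available

# On my 64-CPU machine this would suggest 59 rollout workers.
print(plan_rollout_workers(64))
```

Does this kind of reasoning make sense, or should I be leaving more headroom per worker (e.g. when num_envs_per_worker > 1)?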

Thank you in advance!

Hello RLlib team, sorry to bother you again, but I am stuck on this and would really appreciate some insight. Figure 5b shows that with 64 workers it is possible to reach a speed of 40k fps, and the description says that 1 V100 GPU was used. But I cannot figure out how many CPUs and how much CPU RAM I would need to run a similar experiment with 64 workers.
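In case it helps pin down the question, this is roughly the config I would be aiming for (sketched against the Ray 2.x `ApexDQNConfig` builder API; the env name and num_envs_per_worker=8 are my assumptions, and method names may differ across Ray versions):

```python
# Sketch of an APE-X config aiming to reproduce the 64-worker setup.
# Assumptions: Ray 2.x API, ALE env via gymnasium naming, 8 envs/worker.
from ray.rllib.algorithms.apex_dqn import ApexDQNConfig

config = (
    ApexDQNConfig()
    .environment("ALE/Breakout-v5")          # assumed benchmark env
    .rollouts(
        num_rollout_workers=64,              # as in Figure 5b
        num_envs_per_worker=8,               # my guess, not from the paper
    )
    .resources(num_gpus=1)                   # single learner GPU
)
algo = config.build()
```

Given a config like this, how do I estimate the total CPU and RAM requirements (workers plus replay buffer) before launching?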