Strange behavior of Apex framework

How severely does this issue affect your experience of using Ray?

  • Low: It annoys or frustrates me for a moment.

Hello,

In my project, I’m training a policy on a real industrial robot. Algorithms like SAC or DDPG break real-time execution, even with tuned rollout_fragment_length and training_intensity. To run at the required frequency (10 Hz), I integrated SAC with the Apex framework. Everything works well, but I’ve noticed that from time to time the robot stops moving while gradients are being computed and, conversely, moves while no gradients are being computed.

In this setup, I’m using only one actor, which corresponds to the one robot. My question: is it possible to modify Apex to remove these stops, or to modify SAC so that gradient computation and data collection run asynchronously?

I’m using Ray 2.1.0.

Greg.

@Souphis How many rollout workers are you using? Workers can stop while syncing weights or computing gradients. If there is no rollout worker, there is no way to compute gradients while collecting experiences.

@arturn I’m using only one rollout worker because I only have one robot for experiments. So is the delay caused by that number?

So you use num_rollout_workers=1?
I ask because with num_rollout_workers=0 only the local worker can sample, and that is the same actor that learns.
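
To illustrate why sampling and learning in the same actor stalls the robot, here is a toy timing model. It is not RLlib's actual scheduling; the function name and all numbers (100 ms control period, 250 ms update cost, one update every 4 steps) are made up for the example. It only shows that when a single loop does both jobs, every gradient update delays the next action by its full duration.

```python
def action_gaps(num_steps, period_ms, learn_every, learn_cost_ms):
    """Time (ms) between consecutive actions when one loop samples AND learns.

    Hypothetical model: each step takes `period_ms`; every `learn_every`
    steps a gradient update runs in the same loop and blocks for
    `learn_cost_ms` before the next action can be sent.
    """
    gaps = []
    for step in range(1, num_steps + 1):
        gap = period_ms
        if step % learn_every == 0:
            gap += learn_cost_ms  # the update delays the next action
        gaps.append(gap)
    return gaps


# Assumed numbers: 10 Hz control, 250 ms per update, update every 4 steps.
gaps = action_gaps(num_steps=8, period_ms=100, learn_every=4, learn_cost_ms=250)
print(max(gaps), min(gaps))  # 350 100
```

With a separate rollout worker the update cost moves out of the control loop, so in this model the gap would stay at 100 ms unless the worker itself blocks, for example while receiving new weights.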

Yes, in this case I’m using a config with num_rollout_workers=1. Also, is there perhaps another way to achieve asynchronous training for off-policy algorithms?
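
For reference, the general shape of asynchronous off-policy training can be sketched with plain Python threads: one thread steps the robot at a fixed rate and writes to a replay buffer, while another thread samples from the buffer and runs updates. This is only a sketch under stated assumptions, not RLlib code and not how Apex is implemented: `ReplayBuffer`, `sampler`, `learner`, and `run` are hypothetical names, the transitions are random numbers, and the gradient update is replaced by a short sleep.

```python
import random
import threading
import time
from collections import deque


class ReplayBuffer:
    """Thread-safe FIFO buffer shared by the sampler and the learner."""

    def __init__(self, capacity):
        self._data = deque(maxlen=capacity)
        self._lock = threading.Lock()

    def add(self, transition):
        with self._lock:
            self._data.append(transition)

    def sample(self, batch_size):
        with self._lock:
            if len(self._data) < batch_size:
                return None  # not enough data yet
            return random.sample(list(self._data), batch_size)


def sampler(buffer, stop, stats, period_s):
    """Steps the (fake) robot at a fixed rate; never waits for the learner."""
    next_tick = time.monotonic()
    while not stop.is_set():
        buffer.add((random.random(), random.random()))  # fake (obs, reward)
        stats["env_steps"] += 1
        next_tick += period_s
        time.sleep(max(0.0, next_tick - time.monotonic()))


def learner(buffer, stop, stats):
    """Runs updates whenever a batch is available, independent of sampling."""
    while not stop.is_set():
        batch = buffer.sample(batch_size=8)
        if batch is None:
            time.sleep(0.001)
            continue
        time.sleep(0.005)  # placeholder for a real gradient update
        stats["updates"] += 1


def run(duration_s=1.0, period_s=0.1):
    """Run both threads for `duration_s` seconds and return step counters."""
    buffer = ReplayBuffer(capacity=1000)
    stop = threading.Event()
    stats = {"env_steps": 0, "updates": 0}  # each key has a single writer
    threads = [
        threading.Thread(target=sampler, args=(buffer, stop, stats, period_s)),
        threading.Thread(target=learner, args=(buffer, stop, stats)),
    ]
    for t in threads:
        t.start()
    time.sleep(duration_s)
    stop.set()
    for t in threads:
        t.join()
    return stats
```

Calling `run(duration_s=2.0, period_s=0.1)` would step the fake robot at roughly 10 Hz while updates proceed in the background. In a real setup the learner would more likely live in a separate process or Ray actor, since Python threads share the interpreter lock and heavy updates could still disturb the control loop.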