Target_network_update_freq APEX vs DQN

How severe does this issue affect your experience of using Ray?

  • Low: It annoys or frustrates me for a moment.

I want to update my target network every 1000 steps. One episode lasts for 1000 steps. So I expect 1 update per episode.

On DQN it is straightforward: I set target_network_update_freq=1000 and that's it.
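
For reference, a minimal sketch of the plain-DQN setup I mean (the env name and stop condition are placeholders, not from my actual experiment; only target_network_update_freq matters here):

```python
import ray
from ray import tune

ray.init()

# Sketch only: CartPole-v1 and the stop criterion are illustrative placeholders.
# The single relevant knob is target_network_update_freq.
tune.run(
    "DQN",
    config={
        "env": "CartPole-v1",
        "target_network_update_freq": 1000,  # sync the target net every 1000 sampled timesteps
    },
    stop={"timesteps_total": 100_000},
)
```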

On APEX-DQN, it depends on a lot of parameters, namely train_batch_size, training_intensity, num_workers and, of course, target_network_update_freq. An illustrative config is sketched below.
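
This is the shape of the APEX config I mean; the concrete values here are placeholders for illustration, not a combination I claim gives one update per episode:

```python
from ray import tune

# Illustrative APEX-DQN config showing the knobs in question.
# All values are placeholders, not known-good settings.
apex_config = {
    "env": "CartPole-v1",                # placeholder env
    "num_workers": 8,                    # parallel rollout workers feeding the replay actors
    "train_batch_size": 512,             # timesteps pulled from replay per learner step
    "training_intensity": 1,             # ratio of trained to sampled timesteps (placeholder)
    "target_network_update_freq": 1000,  # effective cadence also depends on the knobs above
}

tune.run("APEX", config=apex_config, stop={"timesteps_total": 1_000_000})
```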

How can I properly set those parameters so that the learner performs one target update per episode on APEX? The documentation does not say anything about this.

See the TensorBoard metric below, where I've tried different combinations of parameters for APEX vs. DQN (in teal).

[Screenshot: apex_vs_dqn TensorBoard comparison]

I have upgraded to the latest commit on master with the Ray 3.0 (nightly) wheel, and it works as expected.