Restoring Checkpoint Does Not Include Target Net

Hi everyone,

I have already opened an issue on the GitHub page:

I believe there is a bug when restoring an Apex-DQN agent: only the actual (online) network is restored, not the target network. This causes a high td_error and, as a result, low rewards after the restore.

I have read all the discussions about checkpointing and resuming training, but I could not find a solution.

One short-term workaround would be to store the weights myself and force them onto the target network with the ._set_weights API…
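
To make the idea concrete, here is a minimal sketch of that workaround in plain Python. It does not use the library's real checkpoint format or the ._set_weights call, whose exact signature I have not confirmed. `ToyAgent`, `save_checkpoint`, and `restore_checkpoint` are hypothetical names made up for illustration. The only point is that the checkpoint carries both sets of weights and the restore writes both back.

```python
import copy
import pickle


class ToyAgent:
    """Hypothetical stand-in for an agent with an online and a target network."""

    def __init__(self):
        self.online = {"w": [0.0, 0.0]}
        self.target = copy.deepcopy(self.online)

    def train_step(self):
        # Pretend update: only the online network changes during training.
        self.online["w"] = [w + 1.0 for w in self.online["w"]]

    def sync_target(self):
        # Periodic hard update: copy online weights into the target network.
        self.target = copy.deepcopy(self.online)


def save_checkpoint(agent):
    # Store BOTH networks, not just the online one.
    return pickle.dumps({"online": agent.online, "target": agent.target})


def restore_checkpoint(agent, blob):
    state = pickle.loads(blob)
    agent.online = state["online"]
    # Explicitly force the saved weights onto the target network as well.
    agent.target = state["target"]


agent = ToyAgent()
agent.train_step()
agent.sync_target()  # target now matches the online weights after one step
agent.train_step()   # online moves ahead; target lags behind, as in DQN
blob = save_checkpoint(agent)

restored = ToyAgent()
restore_checkpoint(restored, blob)
print(restored.online, restored.target)
```

In a real agent the two dictionaries would be whatever weight structures the policy exposes, and the restore step would call the library's own setter for the target network.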

Any ideas or recommendations?