Hi all, I hope you can help.
TL;DR: Technically, how can I process, in the same “forward” pass, both the observations from the environment and some other hard-coded observations, so that I can compute an intrinsic loss (using the same model for both)?
I’m using a custom A3C model, and I want to hard-code the processing of some additional observations, calculate an intrinsic loss from them, and then add this value to the general loss (so that the gradients flow for the actual observed states and also for my hard-coded ones).
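To make the loss part concrete, here is a minimal sketch of what I mean by adding the intrinsic term to the general loss. It is plain Python with floats standing in for TF tensors, and it does not use RLlib at all (in RLlib I believe the place for this would be the model's `custom_loss(policy_loss, loss_inputs)` hook, but I may be wrong). The names `combine_losses`, `mean_squared_error`, and `INTRINSIC_COEFF` are made up for illustration, and the MSE is only a placeholder for whatever the intrinsic loss really is.

```python
# Illustrative stand-in only: plain floats instead of TF tensors, no RLlib.
# Every name below is hypothetical and not part of my actual model.

INTRINSIC_COEFF = 0.1  # assumed weighting of the intrinsic term


def mean_squared_error(predictions, targets):
    """Toy intrinsic loss: MSE between model outputs and fixed targets."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    squared = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return sum(squared) / len(squared)


def combine_losses(policy_loss, intrinsic_loss, coeff=INTRINSIC_COEFF):
    """Add a weighted intrinsic term to the regular policy loss."""
    return policy_loss + coeff * intrinsic_loss
```

With real tensors, the important part is that the intrinsic term is built from outputs of the same model variables, so that differentiating the combined loss also sends gradients through the hard-coded pass.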
It seems that I have everything working (with dummy variables) except:
When “forward” calls “compute_intrinsic_reward”, which in turn takes some observations and calls self.actions_model.forward_rnn(…), the run gets stuck until it runs out of memory, and I don’t know where exactly it breaks.
Do you have any idea? I guess that the hard-coded call to the main model inside “forward” messes things up, but I don’t know how.
I’m using TF, and the “actions_model” inherits from RecurrentTFModelV2.
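Since I don't know where the run gets stuck, one thing I could try is to have Python dump the stack of every thread if a block takes too long. The sketch below uses only the standard-library `faulthandler` module. The helper name `dump_stack_if_stuck` and the 30-second default are my own assumptions, and I have not checked how this interacts with Ray worker processes or with TF graph construction.

```python
import faulthandler
import sys
from contextlib import contextmanager


@contextmanager
def dump_stack_if_stuck(timeout_s=30.0, out=None):
    """Dump all thread stacks to `out` if the wrapped block outlives timeout_s.

    `out` must be a real file object (it needs a file descriptor);
    it defaults to sys.stderr. This helper is hypothetical, not an RLlib API.
    """
    target = sys.stderr if out is None else out
    # repeat=True keeps dumping every timeout_s seconds while the block hangs.
    faulthandler.dump_traceback_later(timeout_s, repeat=True, file=target)
    try:
        yield
    finally:
        # Always cancel the watchdog, even if the block raised.
        faulthandler.cancel_dump_traceback_later()
```

The idea would be to wrap the suspicious call, e.g. `with dump_stack_if_stuck(60.0): ...` around the call to “compute_intrinsic_reward”, and then look at which frames show up in the dumps while the memory keeps growing.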
EDIT:
I think that even just an explanation of whether, and how, I can run some hard-coded observations through “forward_rnn” (during “forward”) would be very helpful!
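To show what I mean by running both kinds of observations through the same model, here is a toy sketch in plain Python (no TF, no RLlib). `TinyRecurrentCell` and `forward_with_extra` are made-up names. The toy `forward_rnn` only mimics the inputs/state layout of the real method and drops the `seq_lens` argument, and starting the hard-coded sequences from a zero state is just an assumption. The idea is to stack the hard-coded observations onto the batch, call the shared model once, and split the outputs afterwards.

```python
import math


class TinyRecurrentCell:
    """Stand-in for the shared model: one set of weights reused for every call."""

    def __init__(self, w_in=0.5, w_rec=0.25, bias=0.0):
        self.w_in = w_in
        self.w_rec = w_rec
        self.bias = bias

    def forward_rnn(self, inputs, state):
        """inputs: batch of sequences (list of lists of floats).
        state: one hidden value per batch row.
        Returns (outputs, new_state) in the same batch order."""
        outputs, new_state = [], []
        for seq, h in zip(inputs, state):
            seq_out = []
            for x in seq:
                h = math.tanh(self.w_in * x + self.w_rec * h + self.bias)
                seq_out.append(h)
            outputs.append(seq_out)
            new_state.append(h)
        return outputs, new_state


def forward_with_extra(cell, env_obs, env_state, extra_obs):
    """Run env and hard-coded observations through the same cell in one call."""
    # Assumption: hard-coded sequences start from a fresh zero state.
    extra_state = [0.0] * len(extra_obs)
    all_out, all_state = cell.forward_rnn(
        env_obs + extra_obs, env_state + extra_state
    )
    n = len(env_obs)
    # Only the env rows are returned as policy outputs and recurrent state;
    # the extra rows would feed the intrinsic loss.
    return all_out[:n], all_state[:n], all_out[n:]
```

In this toy version the rows are independent, so the env outputs are the same as if the hard-coded rows were not there. With real TF tensors I assume the sequences would also need matching lengths (or padding) before they can be concatenated along the batch axis.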