Customise policy to only do forward/backward pass for certain observations

Hi!

Thanks for the reply.
That’s what I ended up doing and it works.
However, as far as I understand, this still requires a forward/backward pass, which causes overhead. I tried to solve the issue by customising compute_single_action in the PPO trainer (post), but that did not work.
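
To make the goal concrete, here is a minimal, framework-independent sketch of the behaviour I am after: the network is only called for observations that actually need a decision, and a fixed action is returned otherwise. This is not RLlib's API; `make_gated_policy`, `needs_decision`, `default_action` and `expensive_policy` are hypothetical names used only for illustration.

```python
from typing import Callable, List, Sequence

Observation = Sequence[float]
Action = int


def make_gated_policy(
    policy_fn: Callable[[Observation], Action],
    needs_decision: Callable[[Observation], bool],
    default_action: Action = 0,
) -> Callable[[Observation], Action]:
    """Wrap policy_fn so it is only called when needs_decision(obs) is True."""

    def gated(obs: Observation) -> Action:
        if not needs_decision(obs):
            # Short-circuit: no forward pass for this observation.
            return default_action
        return policy_fn(obs)

    return gated


# Stand-in for the real (expensive) network forward pass.
calls: List[Observation] = []


def expensive_policy(obs: Observation) -> Action:
    calls.append(obs)  # record each real forward pass
    return 1


# Assumption for this example: obs[0] > 0 marks an observation that needs an action.
gated = make_gated_policy(expensive_policy, needs_decision=lambda obs: obs[0] > 0)
actions = [gated([0.0, 1.0]), gated([1.0, 2.0])]
```

This only covers action computation. To avoid the backward pass as well, the short-circuited steps would also have to be left out of the training batch, which is not shown here.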