I’ve trained agents in a multi-agent environment with PPO (TensorFlow) and would like to analyze their behavior. Specifically, I’d like to apply activation maximization and techniques similar to saliency maps from image processing. To do so, I need to compute the gradients of the network’s output with respect to its input. My question is whether these gradient operations can be done within RLlib.
I’ve already tried wrapping `policy.compute_actions` in a `tf.GradientTape`, but this failed, presumably due to the internal data conversion / serialization inside `compute_actions`.
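Conceptually, this is what I’m after (a minimal sketch; the direct call to `policy.model` and the 6-dimensional observation are assumptions from my setup, and I’m not sure this is the intended way to reach the underlying network):

```python
import numpy as np
import tensorflow as tf

# policy is an initialized and loaded TF2 PPO policy.
obs = tf.convert_to_tensor(
    np.array([[0.1, 0.1, 0.1, 0.1, 0.1, 0.1]], dtype=np.float32))

with tf.GradientTape() as tape:
    tape.watch(obs)
    # ModelV2.__call__ takes an input dict and returns (logits, state).
    logits, _ = policy.model({"obs": obs})
    target = logits[0, 0]  # e.g. saliency of the first action logit

saliency = tape.gradient(target, obs)  # d(logit) / d(obs)
```

Also, I’ve tried something similar to this: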
```python
import numpy as np

# policy is an initialized and loaded PPO policy
batch = policy._get_dummy_batch_from_view_requirements()
batch["obs"] = np.array([0.1, 0.1, 0.1, 0.1, 0.1, 0.1]).reshape(batch["obs"].shape)

# compute_actions returns a tuple (actions, state_outs, extra_fetches)
actions, _, _ = policy.compute_actions(batch["obs"], explore=False)
batch["actions"] = actions

# Tried with and without:
# batch = policy.postprocess_trajectory(batch)
grads = policy.compute_gradients(batch)
```
but it crashes in all the variations I’ve tried so far. Also, I’m not sure that `compute_gradients` can give me what I want at all: as far as I understand, it computes the gradients of the PPO loss with respect to the model weights, not the gradients of the outputs with respect to the observations.
What would your ideas be to achieve this? Would the best option (if it is even possible) be to replicate the PPO network with the weights from the checkpoint and then do all my “weird stuff” on that separate network, without using Ray/RLlib at all?
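For the replication route, this is roughly what I have in mind (a hypothetical sketch; `obs_dim`, `num_outputs`, the layer sizes, and the tanh activation are assumptions based on RLlib’s default fcnet model, and the weight mapping is hand-waved):

```python
import tensorflow as tf

# Assumed dimensions from my env / action space (hypothetical).
obs_dim, num_outputs = 6, 4

# Rebuild something like RLlib's default fully connected policy net
# (assuming fcnet_hiddens=[256, 256] with tanh activations).
standalone = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(obs_dim,)),
    tf.keras.layers.Dense(256, activation="tanh"),
    tf.keras.layers.Dense(256, activation="tanh"),
    tf.keras.layers.Dense(num_outputs),
])

# policy.get_weights() returns the policy's variables as numpy arrays;
# these would still have to be filtered and mapped onto the standalone
# layers (the PPO model also carries value-function weights).
weights = policy.get_weights()
```

Once the weights are mapped, the `GradientTape` computation from the first sketch would work directly on `standalone`. But this feels like a lot of duplication, so I’d prefer an RLlib-native way if one exists.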
Thanks in advance!