Observation space conflict with wrap_deepmind

I am trying to use the wrap_deepmind function on my Gym environment to generate rollouts with the DeepMind-transformed observations. However, after I wrap the environment and call compute_single_action, I get a ValueError saying that the observation, which is shaped (84, 84, 4), is out of bounds. The bounds correspond to the original observation space prior to wrapping, which is (210, 160, 3). Does anyone know how to reconcile this?
You are probably calling Policy.compute_single_action() instead of Algorithm.compute_single_action(). The Policy does not hold a wrapped environment or a preprocessor, so it cannot apply this transformation; it only validates the observation you pass in against the space it was built with, which is what produces the bounds error.
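To make the idea concrete, here is a minimal, self-contained sketch of the failure mode. The names (ToyPolicy, ToyAlgorithm, Box, Obs) are illustrative stand-ins, not RLlib's actual internals, and the shapes just follow the principle rather than RLlib's exact internal spaces: the policy validates observations against a fixed space and holds no preprocessor, while the algorithm applies the transform before delegating.

```python
class Obs:
    """Stand-in observation carrying only a shape."""
    def __init__(self, shape):
        self.shape = shape

class Box:
    """Stand-in observation space that validates by shape."""
    def __init__(self, shape):
        self.shape = shape
    def contains(self, obs):
        return obs.shape == self.shape

class ToyPolicy:
    """Like Policy: validates against its space, has no preprocessor."""
    def __init__(self, obs_space):
        self.obs_space = obs_space
    def compute_single_action(self, obs):
        if not self.obs_space.contains(obs):
            raise ValueError(
                f"observation shape {obs.shape} is outside the bounds "
                f"of {self.obs_space.shape}")
        return 0  # dummy action

class ToyAlgorithm:
    """Like Algorithm: applies preprocessing, then delegates to the policy."""
    def __init__(self, policy, preprocess):
        self.policy = policy
        self.preprocess = preprocess
    def compute_single_action(self, obs):
        return self.policy.compute_single_action(self.preprocess(obs))

# Policy built for the preprocessed (84, 84, 4) space:
policy = ToyPolicy(Box((84, 84, 4)))

# Calling the policy directly with a differently-shaped observation
# fails validation, mirroring the ValueError in the question:
try:
    policy.compute_single_action(Obs((210, 160, 3)))
except ValueError as e:
    print(e)

# Going through the algorithm applies the transform first, so it succeeds:
algo = ToyAlgorithm(policy, lambda o: Obs((84, 84, 4)))
print(algo.compute_single_action(Obs((210, 160, 3))))  # prints 0
```

In real code the fix is the same shape: generate rollouts via Algorithm.compute_single_action() (or pass the raw, unwrapped observation and let RLlib's own preprocessor handle the DeepMind transform), rather than feeding externally wrapped observations to the Policy directly.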