I am trying to use the wrap_deepmind function on my Gym environment to generate rollouts with the DeepMind-transformed observations. However, after I wrap the environment and call compute_single_action, I get a ValueError saying that the observation, which has shape (84, 84, 4), is outside of bounds. The bounds correspond to the original observation space (prior to wrapping), which has shape (210, 160, 3). Does anyone know how to reconcile this?
You are probably calling Policy.compute_single_action() instead of Algorithm.compute_single_action(). Policy does not hold a wrapped environment or preprocessor to do this transformation.
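To make the mismatch concrete, here is a toy sketch in plain Python. It does not use RLlib or Gym at all: ToySpace and check_observation are hypothetical stand-ins I made up for illustration, and the real check inside the library may look quite different. It only shows why an already-wrapped (84, 84, 4) observation fails when it is validated against the unwrapped (210, 160, 3) space.

```python
# Toy model only: ToySpace and check_observation are hypothetical stand-ins,
# not RLlib or Gym APIs. Only shapes are tracked, not pixel values.
from dataclasses import dataclass
from typing import Tuple

Shape = Tuple[int, ...]


@dataclass(frozen=True)
class ToySpace:
    """Stand-in for an observation space that only remembers its shape."""
    shape: Shape

    def contains(self, obs_shape: Shape) -> bool:
        return obs_shape == self.shape


RAW_SPACE = ToySpace((210, 160, 3))    # original frame, before wrapping
WRAPPED_SPACE = ToySpace((84, 84, 4))  # after the deepmind-style wrappers


def check_observation(space: ToySpace, obs_shape: Shape) -> None:
    """Raise ValueError when the observation does not fit the given space."""
    if not space.contains(obs_shape):
        raise ValueError(
            f"observation with shape {obs_shape} is outside the bounds of "
            f"a space with shape {space.shape}"
        )


def reproduces_error() -> bool:
    """Validate a wrapped observation against the unwrapped space."""
    try:
        check_observation(RAW_SPACE, WRAPPED_SPACE.shape)
    except ValueError:
        return True
    return False
```

In this toy model the error goes away as soon as the observation and the space it is checked against come from the same side of the wrapper, which is the point of the answer above: the component doing the check has to know about the transformation.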