Currently, Box/MultiBinary/MultiDiscrete observation spaces require the observation to be an np.array; anything else fails with the familiar error: ValueError: ('Observation for a Box/MultiBinary/MultiDiscrete space should be an np.array, not a Python list.'
This is a bit annoying in situations where only a single value is returned. It would be nice if RLlib could handle the single-value case as a special case here.
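For reference, the workaround I use in the meantime looks roughly like the sketch below. The helper name `as_observation_array` and its defaults (shape `(1,)`, dtype `float32`) are my own assumptions and not part of RLlib or Gym; the shape and dtype would have to match whatever the observation space declares.

```python
import numpy as np


def as_observation_array(value, shape=(1,), dtype=np.float32):
    """Coerce a scalar or Python list into an np.ndarray of the given shape.

    Hypothetical helper for illustration only; not an RLlib or Gym API.
    """
    # np.asarray turns a bare scalar into a 0-d array and a list into an n-d array.
    arr = np.asarray(value, dtype=dtype)
    # Reshape so the result matches the shape declared by the observation space.
    return arr.reshape(shape)


# In an environment's reset()/step(), wrap the single value before returning it:
obs = as_observation_array(0.5)
```

Something along these lines inside the library would avoid having to repeat this in every environment that returns a single value.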
Posted here as well.