Single-value output doesn't need to be an array

Currently, a Box/MultiBinary/MultiDiscrete observation space requires the observation to be an np.array; otherwise you get the familiar error: ValueError: 'Observation for a Box/MultiBinary/MultiDiscrete space should be an np.array, not a Python list.' This is a bit annoying in situations where only a single value is returned. It would be nice if RLlib could handle the single-value case as a special case here.
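For reference, the workaround on the environment side is to wrap the value before returning it. This is only a minimal sketch, assuming a Box space of shape (1,) with float32 dtype; `to_box_observation` is a hypothetical helper name of my own, not part of RLlib or Gym:

```python
import numpy as np


def to_box_observation(value, shape=(1,), dtype=np.float32):
    """Hypothetical helper (not an RLlib API): wrap a scalar or list
    in an np.ndarray with the shape and dtype the space expects."""
    obs = np.asarray(value, dtype=dtype)  # a bare scalar becomes a 0-d array
    return obs.reshape(shape)             # reshape to the assumed space shape


# A bare Python float turns into a 1-element array.
obs = to_box_observation(0.5)
print(type(obs).__name__, obs.shape, obs.dtype)  # ndarray (1,) float32
```

Having to do this in every env that returns a single number is the boilerplate this request is about.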

Posted here as well.