'Observation for a Box/MultiBinary/MultiDiscrete space should be an np.array, not a Python list.'

Getting this error with my observation space. Here is the code:

bounds = np.array([
    np.finfo(np.float32).max,
    np.finfo(np.float32).max,
    np.finfo(np.float32).max
])
self.observation_space = Box(-bounds, bounds, dtype=np.float32)

Is this not how you would correctly set up an observation space? I tested by having an observation be np.array([0, 0, 0]), but it’s still giving me this error. Any ideas?
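For what it’s worth, here is roughly how I’ve been sanity-checking whether an observation fits the space (a minimal standalone sketch; the space mirrors my bounds above):

```python
import numpy as np
from gym.spaces import Box

# Mirror the bounds from the snippet above.
bounds = np.array([np.finfo(np.float32).max] * 3, dtype=np.float32)
space = Box(-bounds, bounds, dtype=np.float32)

# An np.array of the right shape and dtype is accepted by the space.
obs = np.zeros(3, dtype=np.float32)
print(space.contains(obs))  # True
```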

@Yared_Kokeb,

I would try the following:

self.observation_space = Box(low=-np.inf, high=np.inf, shape=(3,), dtype=np.float32)

Hope this works for what you want to achieve.
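If it helps, a quick standalone check of that space (just a sketch, nothing specific to your env):

```python
import numpy as np
from gym.spaces import Box

space = Box(low=-np.inf, high=np.inf, shape=(3,), dtype=np.float32)

# Samples drawn from the space are np.arrays with the declared shape/dtype,
# which is exactly what your env's observations should look like.
sample = space.sample()
print(sample.shape, sample.dtype)  # (3,) float32
```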

Best,
Simon

Hey Simon,

Thanks so much for getting back to me. I’ve tried testing with an extremely crude environment, but I’m getting the same error.

import gym
from gym.spaces import Discrete, Box, Dict
from gym.utils import seeding
import numpy as np

class DummyEnv(gym.Env):
    def __init__(self, config): 
        self.observation = None
        self.reward = 0
        self.done = False

        num_obs = 3 # Displacement between scim and enem, Displacement between enem and area, num of enemies left
        self.action_space = Discrete(5)
        self.observation_space = Box(low=-np.inf, high=np.inf, shape=(num_obs,), dtype=np.float32)
        self.seed()

    def reset(self): 
        self.observation = None
        self.reward = 0
        self.done = False

    def seed(self, seed=None): 
        """
        Sets the seed for this env's random number generator
        """
        self.rand, seed = seeding.np_random(seed)
        return [seed]

    def step(self, action): 
        assert self.action_space.contains(action)
        
        if action == 0:
            self.reward += 1
        elif action == 1:
            self.reward += 2
        elif action == 2:
            self.reward += 3
        elif action == 3:
            self.reward += 4
        elif action == 4:
            self.reward += 5

        self.observation = np.zeros(3, dtype=np.float32)  # match the observation_space dtype

        if self.reward >= 10:
            self.done = True
        
        assert isinstance(self.observation, np.ndarray), "Observation should be a numpy array"
        return self.observation, self.reward, self.done, {}
    
    def close(self):
        pass

Hi Yared,

I tested your code and it ran. I created a DummyEnv object and also an observation space, according to your initial snippet.

gym==0.18.3
numpy==1.19.5

Cheers

Hey arturn,

I was able to get this resolved by simply returning an observation from the reset function; the missing return value was what caused my error. How you managed to train the environment without doing that is beyond me.
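For reference, the fix looks roughly like this (a minimal sketch with a class name of my own, just to show the return value; the real env keeps the rest of its state handling):

```python
import numpy as np

class DummyEnvFixed:
    """Minimal sketch: reset() must return the initial observation."""

    def reset(self):
        self.reward = 0
        self.done = False
        # Return an np.array (float32, matching the Box dtype),
        # not None and not a Python list.
        self.observation = np.zeros(3, dtype=np.float32)
        return self.observation

obs = DummyEnvFixed().reset()
print(type(obs), obs.dtype)  # <class 'numpy.ndarray'> float32
```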

Best regards,
Peter Parker

Hi,
Your initial post did not say that the error occurred during training, so I just ran your code and created all the objects; it did not throw the error you posted. Apart from that: yes, the gym API dictates that the reset function must return an initial observation. But the env you posted returns None, right? Not an np.array, and not a Python list either.
Cheers