Observation space issue with unexpected (?, 61) shape

Hi,
I was recently working on a research project for my graduate degree, using RLlib with a custom environment. I trained several models on this environment (PPO, DQN, ARS, and a custom model) and everything worked fine. However, my issues started to appear when I wanted to evaluate.

One example is running the PPO trainer from the documentation with mostly default configuration: the training phase goes well, the reward increases, and all is okay. But when I get to the evaluation part, I keep facing this error:

ValueError: Cannot feed value of shape (61,) for Tensor default_policy/obs:0, which has shape (?, 61)

Here’s how I do my evaluation:

and this is my Observation space defined in the environment:

        self.observation_space_dict = Dict({
            'action_mask': Box(0, 1, shape=(self.buffer_length,), dtype=np.float32),
            'avail_actions': Box(-np.inf, np.inf, shape=(self.buffer_length,), dtype=np.float32),
            'Online_Buffer': Box(low=-2, high=np.inf, shape=(self.buffer_length,), dtype=np.float32),
            'C_jobs': MultiBinary(self.buffer_length),
            'RemLaxity_jobs': Box(low=-np.inf, high=np.inf, shape=(self.buffer_length, 2), dtype=np.float32),
            'ProcessorSpeed': Box(low=np.array([0.]), high=np.array([np.inf]), dtype=np.float32),
        })

        self.observation_space = flatten_space(self.observation_space_dict)
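As a sanity check on where the 61 comes from: a standalone version of this space flattens to exactly 61 entries when `buffer_length` is 10 (4 × 10 + 10 × 2 + 1), which matches the error message. The value 10 is an assumption here, not taken from the original code:

```python
import numpy as np

# gym on older installs, gymnasium on newer ones; both expose the same utilities.
try:
    from gym.spaces import Box, Dict, MultiBinary
    from gym.spaces.utils import flatdim, flatten, flatten_space
except ImportError:
    from gymnasium.spaces import Box, Dict, MultiBinary
    from gymnasium.spaces.utils import flatdim, flatten, flatten_space

# Assumed value: 4 * 10 + (10 * 2) + 1 = 61, matching the error message.
buffer_length = 10

space_dict = Dict({
    'action_mask': Box(0, 1, shape=(buffer_length,), dtype=np.float32),
    'avail_actions': Box(-np.inf, np.inf, shape=(buffer_length,), dtype=np.float32),
    'Online_Buffer': Box(low=-2, high=np.inf, shape=(buffer_length,), dtype=np.float32),
    'C_jobs': MultiBinary(buffer_length),
    'RemLaxity_jobs': Box(low=-np.inf, high=np.inf, shape=(buffer_length, 2), dtype=np.float32),
    'ProcessorSpeed': Box(low=np.array([0.]), high=np.array([np.inf]), dtype=np.float32),
})

print(flatdim(space_dict))              # total flat size of the space
print(flatten_space(space_dict).shape)  # shape of the flattened Box space
# A single flattened observation is 1-D; it has no batch axis.
print(flatten(space_dict, space_dict.sample()).shape)
```

Note that the flattened observation itself is a plain 1-D vector, while the policy's input tensor expects a leading batch dimension on top of it.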

and this is how I update it:

        obs_dict = {
            'action_mask': self.action_mask,
            'avail_actions': self.action_assignments,
            'Online_Buffer': np.array(self.online_buffer),
            'C_jobs': np.array(self.workbuffer[:, 3]).flatten(),    # Criticality column
            'RemLaxity_jobs': np.array(self.workbuffer[:, 4:6]),    # Remaining time and adjusted priority
            'ProcessorSpeed': np.array([self.speed]).flatten(),
        }
        obs_out = flatten(self.observation_space_dict, obs_dict)

I have been debugging this issue for hours and I can't seem to get to the core of the problem. It keeps reading my observation space in a weird way, and I need to evaluate my models to continue with my research. I would be grateful for any help or a nudge in the right direction.

I also tried using a Tuple instead of a Dict, shown below:

        self.observation_space_dict = Tuple((
            Box(0, 1, shape=(self.buffer_length,)),
            Box(-np.inf, np.inf, shape=(self.buffer_length,)),
            Box(low=-2, high=np.inf, shape=(self.buffer_length,)),
            Box(low=0, high=1, shape=(self.buffer_length,)),
            Box(low=-np.inf, high=np.inf, shape=(self.buffer_length, 2)),
            Box(low=np.array([0.]), high=np.array([np.inf])),
        ))

but I still get the exact same error. I really can't seem to find where the structure with shape (?, 61) comes from, or how to adjust or reshape it.

Hi @omarelseadawy,

The ? dimension is the batch dimension. I think it should have worked without it, but as a quick check, what happens if you try:

        action = policy.compute_actions(np.expand_dims(state, 0))
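In plain NumPy terms (61 being just the flat observation length from the error message), `expand_dims` turns the unbatched vector into a batch of one, which is what the (?, 61) placeholder expects:

```python
import numpy as np

# An unbatched flattened observation, as the environment returns it.
state = np.zeros(61, dtype=np.float32)

# Prepend a batch axis of size 1 so it matches a (?, 61) input tensor.
batched = np.expand_dims(state, 0)

print(state.shape)    # (61,)
print(batched.shape)  # (1, 61)
```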

Hi @mannyv,

Thank you for your reply.
I was slightly confused between compute_action (which was deprecated in favor of compute_single_action) and compute_actions and their respective functionalities. I readjusted my evaluation framework to use compute_single_action instead, and it worked fine.
Sorry, I saw your solution late, so I'm not sure whether it would also have fixed the issue. Thank you for your help! :slight_smile: