Unpack_obs doesn't know to expect 1-hot

Hi, I’m not sure whether this is a bug or I’m missing something. I have an environment with the following obs space:

     spaces.Box(low=0, high=1, shape=(1+n_distractors, 3, 32, 32)),

where n_distractors=1, n_vocab=10, max_len=20. When I try to run my env with the PPO trainer, I get this error:

ValueError: Expected flattened obs shape of [..., 6347], got torch.Size([32, 6165])

I did some digging and found that the error occurs because the unpack_obs function doesn’t know to expect one-hot vectors when unpacking the observation. It expects a size of 6165, since 2*3*32*32 + 20 + 1 = 6165. However, the generated dummy batch has size 6347 = 2*3*32*32 + 200 + 3, which is a one-hot version of the observation.

How do I choose the expected input format in the unpack function? I tried both preprocessor_pref options, but neither solved this problem.
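For reference, the size mismatch in the error message can be reproduced with quick arithmetic. This assumes the 20 tokens are each Discrete(10) (so they one-hot to 200) and the remaining entry is a Discrete(3) (one-hots to 3); those space details are my inference from the numbers, not confirmed:

```python
# Flattened (no one-hot): Box(2, 3, 32, 32) + 20 discrete tokens + 1 discrete entry.
flat_size = 2 * 3 * 32 * 32 + 20 + 1          # what unpack_obs expects

# One-hot: 20 tokens over vocab 10 -> 200, plus an (assumed) Discrete(3) -> 3.
one_hot_size = 2 * 3 * 32 * 32 + 20 * 10 + 3  # what the dummy batch contains

print(flat_size, one_hot_size)  # 6165 6347
```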


Hey @Aceticia , thanks for this question. This is probably caused by a bug in our Preprocessors, which don’t handle the one-hot case properly in a Tuple space.
We are currently in the process of getting rid of preprocessors entirely (they just seem to cause confusion, as they change the observations without the user being in control). On the latest master you can set _disable_preprocessor_api=True. I just tried it on your example (confirmed the error first); this setting made the error go away and I was able to run a training iteration with PPO.
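A minimal config sketch showing where the flag goes (the env name is hypothetical; the other keys are just illustrative):

```python
# Sketch of a PPO trainer config dict with preprocessors disabled,
# so the model receives the raw (Tuple) observations.
config = {
    "env": "my_tuple_obs_env",          # hypothetical env name
    "framework": "torch",
    "_disable_preprocessor_api": True,  # bypass RLlib's Preprocessors
}

print(config["_disable_preprocessor_api"])
```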

If you use the above flag and don’t specify a custom model, RLlib will use the “ComplexInputNetwork” by default, since you have a Tuple observation space. This model will properly one-hot your Discrete and MultiDiscrete inputs before concatenating everything and passing it through Dense layers, e.g.:

    fcnet_hiddens: [128, 128]  # <- hidden layer number and sizes

Use your own custom model via:

    custom_model: [SomeModelV2 sub-class]
    custom_model_config: {
        # some c'tor args for your custom model class
    }

I’ll still try to fix the Preprocessor bug now. Thanks for raising this! :slight_smile:

Here is the fix PR. I’m sorry, we seem to have commented out an important test case some time ago because it kept timing out; that is why we didn’t catch this regression sooner.
I confirmed that you can run your above example already with the current master, though, using _disable_preprocessor_api=True.

I’m glad my question helped you find the bug! One last question: I’m running a multi-agent game with this env, and I’d like both agents to use the same policy class and custom model, but not share weights. Is the following setup correct? I’m also wondering whether I can specify models in the policies item under multiagent, since in the future I might want slightly different models for the two agents.

ModelCatalog.register_custom_model("comm", xxx)

    config = {
        "multiagent": {
            "policies": {"agent1": PolicySpec(), "agent2": PolicySpec()},
            "policy_mapping_fn": (lambda agent_id, **kwargs: agent_id),
        },
        "model": {"custom_model": "comm", "custom_model_config": {xxx}},
    }
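For what it’s worth, a mapping fn like this can be sanity-checked outside of RLlib as a plain function (standalone sketch, assuming agent IDs and policy IDs share the same names):

```python
# Each agent ID maps directly to the policy of the same name,
# mirroring the lambda used in the multiagent config above.
def policy_mapping_fn(agent_id, **kwargs):
    return agent_id

print(policy_mapping_fn("agent1"))  # agent1
print(policy_mapping_fn("agent2"))  # agent2
```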

Hey @Aceticia , yeah, this looks like it would do the job. Two different policies (same class and config). Mapping fn looks good.
I’m assuming your agents have the same name as your policies (“agent1” and “agent2”), which is totally fine! :slight_smile:

The model config will be applied to both policies. You can give each policy its own config overrides to make them slightly different, if you want that. E.g.:

    "agent1": PolicySpec(config={"lr": 0.005}),
    "agent2": PolicySpec(config={"lr": 0.000001}),
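Conceptually, per-policy overrides are merged on top of the shared trainer config, with the override winning on key conflicts. A plain-dict illustration (this mimics the merge semantics, it is not RLlib internals):

```python
# Shared trainer config plus per-policy overrides; keys in the
# override dict replace the shared values, everything else is kept.
base_config = {"lr": 0.0003, "gamma": 0.99}
agent1_override = {"lr": 0.005}
agent2_override = {"lr": 0.000001}

agent1_config = {**base_config, **agent1_override}
agent2_config = {**base_config, **agent2_override}

print(agent1_config)  # {'lr': 0.005, 'gamma': 0.99}
print(agent2_config)  # {'lr': 1e-06, 'gamma': 0.99}
```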