Searching Across Environment Configurations

Shaun_Fattig · June 3, 2022, 3:43pm

Medium: It contributes to significant difficulty to complete my task, but I can work around it.

What is the best way to search different environment configurations using Tune?
I found the env_config key/value pair that can be passed in through the config option of tune.run(), where a tune.<search> algorithm can be used to search across various settings. However, I’m training a multi-agent environment where the policies need to be given the observation and action spaces before tune.run() is executed. Currently I’m achieving this by creating a sample environment just before tune.run() and extracting sample_env.observation_space and sample_env.action_space.
Unless some input validation or something is happening, I’m assuming the policies only need to know about the shape of the observations/actions. However, if the shape changes with each environment configuration, there is a mismatch between policy and environment and an error is thrown.

What is the proper way to handle this situation?

amogkam · June 3, 2022, 8:50pm

Hey @Shaun_Fattig! Just to clarify, are you using RLlib here?

Shaun_Fattig · June 6, 2022, 2:25pm

@amogkam Correct, using RLlib

amogkam · June 6, 2022, 9:22pm

Got it, thanks. cc @gjoliver, could you take a look?

gjoliver · June 6, 2022, 11:31pm

Sorry, I don’t get why you need to create your env before Tune runs.
Can you register an env creation function for your env that takes in env_cfg and creates a corresponding env?
E.g.,
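(The example linked here is not preserved in the thread. What follows is only a minimal sketch of the idea, not the original example. `ToyEnv`, the `num_features` key, and the name `"toy_env"` are hypothetical; a real RLlib multi-agent env would subclass `MultiAgentEnv` and define proper gym spaces. The only RLlib/Tune call assumed is `ray.tune.registry.register_env`, and the import is guarded so the sketch also runs without Ray installed.)

```python
class ToyEnv:
    """Hypothetical stand-in env whose observation shape depends on env_config."""

    def __init__(self, env_config):
        # env_config is the dict passed under config["env_config"].
        self.num_features = env_config.get("num_features", 4)
        self.observation_shape = (self.num_features,)


def env_creator(env_config):
    """Build a fresh env from whatever env_config a given trial receives."""
    return ToyEnv(env_config)


try:
    from ray.tune.registry import register_env

    # After this, config["env"] = "toy_env" refers to env_creator, so the env
    # is created per trial rather than once before tune.run().
    register_env("toy_env", env_creator)
except ImportError:
    # Ray is not installed; env_creator can still be called directly.
    pass
```

With this in place, each trial builds its own env from its own env_config, so nothing has to be instantiated before tune.run().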

Shaun_Fattig · June 7, 2022, 6:42pm

@gjoliver

I’m passing the policies into the ‘multiagent’ key of the Tune config parameter, and each policy requires observations and actions to be specified beforehand.

ray-project/ray/blob/ray-1.6.0/rllib/agents/trainer.py#L430-L463

            "multiagent": {
                # Map of type MultiAgentPolicyConfigDict from policy ids to tuples
                # of (policy_cls, obs_space, act_space, config). This defines the
                # observation and action spaces of the policies and any extra config.
                "policies": {},
                # Keep this many policies in the "policy_map" (before writing
                # least-recently used ones to disk/S3).
                "policy_map_capacity": 100,
                # Where to store overflowing (least-recently used) policies?
                # Could be a directory (str) or an S3 location. None for using
                # the default output dir.
                "policy_map_cache": None,
                # Function mapping agent ids to policy ids.
                "policy_mapping_fn": None,
                # Optional list of policies to train, or None for all policies.
                "policies_to_train": None,
                # Optional function that can be used to enhance the local agent
                # observations to include more state.
                # See rllib/evaluation/observation_function.py for more info.
                "observation_fn": None,

(Excerpt truncated.)

ma_config = {
    # ... other multi-agent settings ...
    'policies': {
        'policy1': (policy_cls1, obs_space1, act_space1, config1),
        'policy2': (policy_cls2, obs_space2, act_space2, config2),
    },
}

tune.run(
    ...,
    config={
        # ... other trainer settings ...
        'multiagent': ma_config,
    },
)

I see in that example environment that PolicySpec is being used to initialize the policies. I’m not familiar with this class, so it may hold the key to my issue.

By the way, I’m currently using Ray 1.6.

gjoliver · June 8, 2022, 8:47am

Any reason you are using a 6-month-old version?

I think you can just specify None for the spaces, in which case RLlib would simply use the Env to determine the spaces:

ray-project/ray/blob/ray-1.6.0/rllib/policy/policy.py#L47-L52

            # If None, use the env's observation space. If None and there is no Env
            # (e.g. offline RL), an error is thrown.
            "observation_space",
            # If None, use the env's action space. If None and there is no Env
            # (e.g. offline RL), an error is thrown.
            "action_space",

Hope this helps.
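
A minimal sketch of what that suggestion might look like, under some assumptions: plain `(policy_cls, obs_space, act_space, config)` tuples are assumed to be treated like `PolicySpec` entries, a `None` policy class is assumed to fall back to the trainer's default policy, and the env is assumed to expose `observation_space` and `action_space`. The helper `make_multiagent_config`, the env name `"my_env"`, and the `grid_size` key are hypothetical.

```python
def make_multiagent_config(policy_ids):
    """Build a 'multiagent' config whose policies leave both spaces unset."""
    return {
        "policies": {
            # Tuple order: (policy_cls, obs_space, act_space, config).
            # None for the spaces defers to the env created for each trial.
            policy_id: (None, None, None, {})
            for policy_id in policy_ids
        },
        # Hypothetical mapping: each agent id is assumed to equal a policy id.
        "policy_mapping_fn": lambda agent_id, *args, **kwargs: agent_id,
    }


ma_config = make_multiagent_config(["policy1", "policy2"])

# Assumed usage (trainer name elided, as in the thread):
# tune.run(
#     ...,
#     config={
#         "env": "my_env",
#         "env_config": {"grid_size": tune.grid_search([5, 10])},
#         "multiagent": ma_config,
#     },
# )
```

Because no space is fixed ahead of time, a change in observation or action shape between env_config values should no longer conflict with the policy definitions.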
