How do I get Policy specific parameters?

kane0058 · May 12, 2021, 7:28pm

Let’s say that I’m building my own Policy similar to AlwaysSameHeuristic from the rock_paper_scissors_multiagent.py, but instead of randomly choosing an action, I want to pass in the action as a parameter. So we change the config to:

config["multiagent"]["policies"]["always_same"] = (AlwaysSameHeuristic, Discrete(3), Discrete(3), {"deterministic_action": 0})

How does AlwaysSameHeuristic access "deterministic_action"? Policy.__init__() is given a TrainerConfigDict, so we could read the AlwaysSameHeuristic parameters from self.config["multiagent"]["policies"]["always_same"]. But if I made two policies, "always_same1" and "always_same2", with different deterministic actions passed in as parameters, how does each policy know whether to read self.config["multiagent"]["policies"]["always_same1"] or self.config["multiagent"]["policies"]["always_same2"]?

sven1977 · May 13, 2021, 4:14pm

Hey @kane0058: If you have this config:

multiagent:
    policies:
        always_same: (None, [obs-space], [action-space], {"deterministic_action": 0})
        always_same2: (None, [obs-space], [action-space], {"deterministic_action": 1})

Then your Trainer will have two different policies. Each one receives its own config argument in its constructor, and that config contains the deterministic_action key, set to 0 for always_same and 1 for always_same2.
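Here is a minimal sketch of what that looks like from inside the policy. It runs without Ray: AlwaysSameHeuristic here is a plain-Python stand-in (a real one would subclass RLlib's Policy, whose constructor takes the same three arguments), build_policies is a hypothetical helper that only imitates the per-policy config merge, and the "Discrete(3)" strings are placeholders for real gym spaces.

```python
class AlwaysSameHeuristic:
    """Stand-in for an RLlib Policy subclass (assumption: same constructor args)."""

    def __init__(self, observation_space, action_space, config):
        self.observation_space = observation_space
        self.action_space = action_space
        self.config = config
        # The per-policy dict (4th element of the policy spec tuple) ends up
        # in this policy's own config, so the key is read directly from it.
        self.deterministic_action = config["deterministic_action"]

    def compute_actions(self, obs_batch, **kwargs):
        # Same action for every observation; no RNN state, no extra info.
        actions = [self.deterministic_action for _ in obs_batch]
        return actions, [], {}


def build_policies(trainer_config):
    """Hypothetical helper: imitates giving each policy its own merged config."""
    policies = {}
    specs = trainer_config["multiagent"]["policies"]
    for policy_id, (cls, obs_space, act_space, overrides) in specs.items():
        # Top-level trainer settings plus this policy's own overrides.
        merged = {**trainer_config, **overrides}
        policies[policy_id] = cls(obs_space, act_space, merged)
    return policies


trainer_config = {
    "gamma": 0.99,
    "multiagent": {
        "policies": {
            "always_same": (
                AlwaysSameHeuristic, "Discrete(3)", "Discrete(3)",
                {"deterministic_action": 0},
            ),
            "always_same2": (
                AlwaysSameHeuristic, "Discrete(3)", "Discrete(3)",
                {"deterministic_action": 1},
            ),
        },
    },
}

policies = build_policies(trainer_config)
```

Each instance only ever sees its own deterministic_action, so the policy never has to look itself up by name in config["multiagent"]["policies"].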

Does this make sense?

Which policy is used for a given agent is determined by the policy_mapping_fn function that you also specify in your "multiagent" config. It's a function that takes an agent ID (coming from the env) and returns the ID of one of these two policies.
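A rough sketch of such a mapping function, assuming the env emits the agent IDs "player1" and "player2" (those IDs are an assumption, adjust them to your env). The extra *args/**kwargs are there only because the exact arguments RLlib passes have varied between versions.

```python
# Assumed agent IDs; the policy IDs match the config above.
AGENT_TO_POLICY = {
    "player1": "always_same",
    "player2": "always_same2",
}


def policy_mapping_fn(agent_id, *args, **kwargs):
    # Return the policy ID that should act for this agent.
    return AGENT_TO_POLICY[agent_id]


# This entry sits next to "policies" inside the "multiagent" config.
multiagent_config = {"policy_mapping_fn": policy_mapping_fn}
```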

kane0058 · May 13, 2021, 10:00pm

Yes it does. I can’t believe I missed that. Thank you for the help.

sven1977 · May 14, 2021, 10:01am

Cool! Glad to help!
