Hello, I have a multi-agent environment in which 2 agents bargain. It is a "turn-based" environment.
For example:
user1 action → env obs for user2 → user2 action → env obs for user1 → user1 action, etc.
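As I understand RLlib's MultiAgentEnv convention, this turn-taking is expressed through the per-agent dicts that reset()/step() return, where only the agent whose turn it is appears (env here is some MultiAgentEnv instance; the names are placeholders):

obs, infos = env.reset()                                        # -> {"user1": obs_for_user1}
obs, rews, terms, truncs, infos = env.step({"user1": action1})  # -> obs == {"user2": ...}
obs, rews, terms, truncs, infos = env.step({"user2": action2})  # -> obs == {"user1": ...}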
In particular:
In one round, an agent proposes a price for an asset, say $50.87,
and the other agent replies with Sell / No Sell.
So the action space for the proposing agent is continuous (i.e. a Box), since the action is the price to offer,
while the action space for the other agent is Discrete(2), since it accepts/rejects the price.
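Concretely, the two action spaces would be something like this (the exact bounds are placeholders):

import numpy as np
from gymnasium.spaces import Box, Discrete

proposer_action_space = Box(low=0.0, high=2.0, shape=(1,), dtype=np.float32)  # price to offer
responder_action_space = Discrete(2)  # 0 = No Sell, 1 = Sell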
Q: I am not sure how to define the action and observation spaces in my multi-agent environment so RLlib can understand them, since one agent proposes continuous numbers (its action is the price to offer) while the other agent observes these continuous prices and replies with a discrete action.
In particular, I don’t know how to define this:
import numpy as np
from gymnasium.spaces import Box

def __init__(self, config=None):
    super().__init__()
    # Single-agent-style spaces; I don't know how to make these per-agent:
    self.action_space = Box(low=0.0, high=2.0, shape=(1,), dtype=np.float16)
    self.observation_space = Box(low=-np.inf, high=np.inf, shape=(3,), dtype=np.float16)
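Here is my current attempt at the multi-agent version, based on the MultiAgentEnv examples I've seen. The agent IDs, bounds, 3-dim observation layout, and reward logic are all my placeholders, and I'm not sure the plain per-agent space dicts (observation_spaces/action_spaces) are the right format for every Ray version; older versions seem to want a gym.spaces.Dict assigned to observation_space/action_space instead:

import numpy as np
from gymnasium.spaces import Box, Discrete
from ray.rllib.env.multi_agent_env import MultiAgentEnv

class BargainEnv(MultiAgentEnv):
    # Turn-based bargaining sketch: "proposer" offers a price, "responder" accepts/rejects.

    def __init__(self, config=None):
        super().__init__()
        self.agents = self.possible_agents = ["proposer", "responder"]
        # Per-agent spaces, keyed by agent ID.
        self.observation_spaces = {
            "proposer": Box(-np.inf, np.inf, shape=(3,), dtype=np.float32),
            "responder": Box(-np.inf, np.inf, shape=(3,), dtype=np.float32),
        }
        self.action_spaces = {
            "proposer": Box(0.0, 2.0, shape=(1,), dtype=np.float32),  # price to offer
            "responder": Discrete(2),                                 # 0 = No Sell, 1 = Sell
        }

    def reset(self, *, seed=None, options=None):
        self._last_price = 0.0
        # Only the agent whose turn it is appears in the returned obs dict.
        return {"proposer": np.zeros(3, dtype=np.float32)}, {}

    def step(self, action_dict):
        if "proposer" in action_dict:
            # Proposer just offered a price; hand the turn to the responder.
            self._last_price = float(action_dict["proposer"][0])
            obs = {"responder": np.array([self._last_price, 0.0, 0.0], dtype=np.float32)}
            return obs, {}, {"__all__": False}, {"__all__": False}, {}
        # Responder accepted (1) or rejected (0) the last offer.
        accept = int(action_dict["responder"]) == 1
        rewards = {  # placeholder payoffs
            "proposer": self._last_price if accept else 0.0,
            "responder": -self._last_price if accept else 0.0,
        }
        if accept:
            # Episode ends; give every agent a final observation.
            obs = {aid: np.zeros(3, dtype=np.float32) for aid in self.agents}
            return obs, rewards, {"__all__": True}, {"__all__": False}, {}
        # Rejected: back to the proposer for a new offer.
        obs = {"proposer": np.zeros(3, dtype=np.float32)}
        return obs, rewards, {"__all__": False}, {"__all__": False}, {}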
Note that both agents will be trained with the same policy: the model learns which price to propose and which prices to accept/reject, at the same time. So, one policy for both agents (for now).
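For the single shared policy, I assume the mapping would look something like the sketch below (policy name is a placeholder). One caveat I'm aware of: a single policy normally needs one common action/observation space, so the Box-vs-Discrete mismatch may force a workaround, e.g. a Tuple action space where each agent only uses its half:

from ray.rllib.algorithms.ppo import PPOConfig

config = (
    PPOConfig()
    .environment(BargainEnv)  # the sketch env from above
    .multi_agent(
        policies={"shared"},
        # Map every agent ID to the one shared policy.
        policy_mapping_fn=lambda agent_id, episode, **kwargs: "shared",
    )
)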
Thanks!