Multi-Agent Transformer

Aidan_McLaughlin · September 14, 2022, 7:07pm

Status: None. Just asking a question out of curiosity.
The recent paper, Multi-Agent Reinforcement Learning is A Sequence Modeling Problem, introduced a novel Multi-Agent Transformer (MAT) algorithm for MARL. The paper reports that MAT is SOTA on several MARL benchmarks, and it seems well suited to distributed training. My questions are as follows:

Do the prolific RLlib authors have plans to implement this algorithm (or similar ones), as they did with QMIX?
If not, how would one design a custom agent for this task? Can Ray Core/Clusters work with such transformer models?
Should I care about this implementation?

I hope this finds you all well. Best,

Aidan

arturn · September 14, 2022, 7:34pm

Hi @Aidan_McLaughlin ,

We have a Decision Transformer (DT) implementation that was merged into master 28 days ago.
Other than that, afaik, there are no plans on our roadmap to implement this paper in the near future. From skimming the paper, Ray Clusters should certainly be able to handle such a workload. Models with such large numbers of parameters are limited only by the memory of the nodes in your cluster.

And lastly: Yes, please care about an implementation. We welcome community contributions very much, even if we have no plans to implement this ourselves. If it is well executed and aligned with our APIs, we will certainly merge it. Depending on how much maintenance it needs, you will have to help us maintain it though. If you are interested, please open a PR early so we can give feedback from time to time.

Cheers

Aidan_McLaughlin · September 14, 2022, 9:18pm

Hey Arturn, thanks for the thoughtful response. Will the DT library work with MultiAgentEnv? Best,
Aidan

mannyv · September 16, 2022, 1:54am

Hi @arturn and @Aidan_McLaughlin,

There is a big conceptual difference between DT and MAT.

DT uses a transformer that operates over consecutive timesteps in a trajectory.

MAT uses a transformer to operate over autoregressive agent actions within a timestep.
AFAIK the transformer is not used across timesteps.
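
To make the difference concrete, here is a rough pure-Python sketch of which axis each sequence runs along. It is not code from either paper or from RLlib: `dt_context`, `mat_decode_step`, and `toy_decoder` are hypothetical names, and the decoder is just a stand-in callable where a real transformer would go. It also assumes that MAT conditions each agent's action on all agents' observations plus the actions already chosen in that timestep.

```python
from typing import Callable, List, Sequence, Tuple

Token = Tuple[str, int, object]
# Hypothetical stand-in for a transformer decoder: context tokens -> one action.
Decoder = Callable[[Sequence[Token]], int]


def dt_context(returns_to_go: Sequence[float],
               states: Sequence[object],
               actions: Sequence[int]) -> List[Token]:
    """DT-style input: one sequence that runs across consecutive timesteps."""
    tokens: List[Token] = []
    for t in range(len(states)):
        tokens.append(("rtg", t, returns_to_go[t]))
        tokens.append(("state", t, states[t]))
        if t < len(actions):  # the latest action is what the model predicts
            tokens.append(("action", t, actions[t]))
    return tokens


def mat_decode_step(agent_obs: Sequence[object], decoder: Decoder) -> List[int]:
    """MAT-style decoding: the sequence runs across agents within ONE timestep.

    Agent i's action is conditioned on the actions already chosen
    for agents 0..i-1 in the same timestep.
    """
    obs_tokens: List[Token] = [("obs", j, o) for j, o in enumerate(agent_obs)]
    actions: List[int] = []
    for _ in range(len(agent_obs)):
        prev = [("action", j, a) for j, a in enumerate(actions)]
        actions.append(decoder(obs_tokens + prev))
    return actions


def toy_decoder(context: Sequence[Token]) -> int:
    # Toy rule: the action equals the number of earlier actions in the context.
    return sum(1 for kind, _, _ in context if kind == "action")
```

With the toy decoder, `mat_decode_step(["o0", "o1", "o2"], toy_decoder)` yields one action per agent, each depending on the ones before it, while `dt_context` never looks across agents at all.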

arturn · September 18, 2022, 11:05pm

@mannyv Thanks for the concise explanation!

@Aidan_McLaughlin have you decided yet whether you want to implement the paper?
MultiAgentEnv does not care what kind of policy you use to produce actions.
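
As a rough illustration of that point, here is a toy sketch in plain Python. It does not import or subclass RLlib's `MultiAgentEnv`; `ToyTwoAgentEnv`, `rollout`, and both policies are hypothetical names that only mimic the convention of dicts keyed by agent ID, with an `"__all__"` entry signalling the end of the episode. The environment only ever sees a dict of actions, so it cannot tell how they were produced.

```python
from typing import Callable, Dict, Tuple


class ToyTwoAgentEnv:
    """Hypothetical toy env with dicts keyed by agent ID (not an RLlib class)."""

    def __init__(self, horizon: int = 3) -> None:
        self.horizon = horizon
        self.t = 0
        self.agents = ["agent_0", "agent_1"]

    def reset(self) -> Dict[str, int]:
        self.t = 0
        return {a: 0 for a in self.agents}

    def step(self, action_dict: Dict[str, int]) -> Tuple[dict, dict, dict, dict]:
        self.t += 1
        obs = {a: self.t for a in self.agents}
        # Every agent gets reward 1.0 when all agents picked the same action.
        same = len(set(action_dict.values())) == 1
        rewards = {a: float(same) for a in self.agents}
        dones = {"__all__": self.t >= self.horizon}
        return obs, rewards, dones, {}


def rollout(env: ToyTwoAgentEnv,
            joint_policy: Callable[[Dict[str, int]], Dict[str, int]]) -> float:
    """Run one episode; the env only ever receives a dict of actions."""
    obs = env.reset()
    total, done = 0.0, False
    while not done:
        obs, rewards, dones, _ = env.step(joint_policy(obs))
        total += sum(rewards.values())
        done = dones["__all__"]
    return total


def independent_policy(obs: Dict[str, int]) -> Dict[str, int]:
    # Each agent acts on its own, ignoring the others.
    return {agent: 0 for agent in obs}


def sequential_policy(obs: Dict[str, int]) -> Dict[str, int]:
    # MAT-like ordering: each agent sees the action chosen just before it.
    actions: Dict[str, int] = {}
    prev = 1
    for agent in sorted(obs):
        actions[agent] = prev  # copy the previous agent's action
        prev = actions[agent]
    return actions
```

Both `rollout(ToyTwoAgentEnv(), independent_policy)` and `rollout(ToyTwoAgentEnv(), sequential_policy)` run through the same environment code unchanged; only the function that builds the action dict differs.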

Aidan_McLaughlin · September 21, 2022, 2:11pm

Thanks Arturn, my team and I are certainly considering it. We’ll keep you updated. Thanks again for the help. Best,
Aidan
