Use encoders of other agents' models in an agent's forward pass

How severely does this issue affect your experience of using Ray?

  • High: It blocks me from completing my task.

Hello,

I am working on a multi-agent setting and I want to develop a custom model such that the forward pass of each agent uses a piece of the other agents' models.

In my case, each agent (model) takes an input of a different size and shape. Therefore, I want each model to have a first FC block that encodes the input into a fixed-size vector. Then, I want the model of agent i to take as input the encoding of agent i and the encodings of some other agents (e.g. its neighbors).

This requires that, in the forward pass for agent i, I have access to the encoders of the other agents, while rollout and training steps still work. How can I achieve this?

Example of how I would implement it:


    def forward(self, input_dict, state, seq_lens):
        agent_obs = input_dict['obs']['agent']
        agent_encoding = self.encoder(agent_obs)
        # Encode each neighbor's observation with that neighbor's own encoder
        neighbor_encodings = []
        for neighbor, neighbor_obs in input_dict['obs']['neighbors'].items():
            encoder = get_encoder_for(neighbor)  # placeholder: look up the neighbor agent's encoder
            neighbor_encodings.append(encoder(neighbor_obs))
        # Aggregate the neighbor encodings into one fixed-size vector (e.g. sum, mean, max)
        neighbors_encoding = torch.stack(neighbor_encodings).mean(dim=0)
        z = torch.cat([agent_encoding, neighbors_encoding], dim=-1)
        # Use z to compute action logits, values, etc.
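
One possible pattern (a sketch, not an official RLlib API) is to keep every agent's encoder in a module-level registry that all model instances in the same process can see: each custom model registers its own encoder under its agent id at construction time and looks up its neighbors' encoders at forward time. The registry name `ENCODER_REGISTRY` and the `custom_model_config` keys (`agent_id`, `own_obs_size`, `enc_size`) below are made-up names for illustration, and the neighbor keys in the obs dict are assumed to match the agent ids:

    import torch
    import torch.nn as nn
    from ray.rllib.models.torch.torch_modelv2 import TorchModelV2

    # Module-level registry shared by all model instances in this process
    # (hypothetical helper, not part of RLlib).
    ENCODER_REGISTRY = {}

    class NeighborAwareModel(TorchModelV2, nn.Module):
        def __init__(self, obs_space, action_space, num_outputs, model_config, name):
            TorchModelV2.__init__(self, obs_space, action_space, num_outputs, model_config, name)
            nn.Module.__init__(self)
            cfg = model_config["custom_model_config"]         # assumed custom keys below
            enc_size = cfg.get("enc_size", 64)
            self.encoder = nn.Sequential(nn.Linear(cfg["own_obs_size"], enc_size), nn.ReLU())
            ENCODER_REGISTRY[cfg["agent_id"]] = self.encoder  # expose this agent's encoder
            self.head = nn.Linear(2 * enc_size, num_outputs)
            self.value_head = nn.Linear(2 * enc_size, 1)
            self._value = None

        def forward(self, input_dict, state, seq_lens):
            own_enc = self.encoder(input_dict["obs"]["agent"])
            # Encode each neighbor's obs with that neighbor's registered encoder.
            neighbor_encs = [
                ENCODER_REGISTRY[nbr_id](nbr_obs)
                for nbr_id, nbr_obs in input_dict["obs"]["neighbors"].items()
            ]
            neighbors_enc = torch.stack(neighbor_encs).mean(dim=0)
            z = torch.cat([own_enc, neighbors_enc], dim=-1)
            self._value = self.value_head(z).squeeze(-1)
            return self.head(z), state

        def value_function(self):
            return self._value

Whether the neighbors' encoders should also receive gradients from agent i's loss is a separate design question (see the trainability discussion further down); if they should stay frozen from agent i's point of view, the lookups can be wrapped in `torch.no_grad()` or the encodings detached.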

Hi Federico! 👋

Welcome to the Ray community! Would you like to ask your question in RLlib Office Hours? It sounds like a good topic!

✍️ Just add the Discuss link to your question to this doc: RLlib Office Hours - Google Docs

Thanks! Hope to see you there!


Are these encoding layers trainable?
What happens during the backward pass if you use some other model's encoding layers in the forward pass?
In other words, if an agent has 3 neighboring nodes, it gets encoded 3 times and shows up 3 times in other agents' obs vectors. Does its encoder get trained 3 times during the backward pass?
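
As background, this is what plain PyTorch does when a module is reused within one forward pass: the backward pass accumulates the gradients from each use into the same parameter tensors, and there is still only one update per optimizer step. A minimal standalone sketch (independent of RLlib):

    import torch
    import torch.nn as nn

    encoder = nn.Linear(4, 2)          # tiny stand-in for an agent's encoder
    obs = torch.randn(3, 4)            # the same agent as seen in 3 neighbors' inputs

    # The encoder is used three times in a single forward pass.
    loss = sum(encoder(obs[i]).sum() for i in range(3))
    loss.backward()

    # encoder.weight.grad now holds the sum of the gradients from all three
    # uses; the parameters themselves exist only once.
    print(encoder.weight.grad.shape)   # torch.Size([2, 4])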

It seems like you are trying to do GraphNN training, in which case, should you consider using something like PyTorch Geometric to train the model?

Yes, I want to do something similar to a GNN, but simpler for now. The issue is that each agent has a different observation space, and I therefore need a different encoder for each.
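
One way to reconcile the different observation spaces (a sketch in plain PyTorch, not tied to RLlib or PyTorch Geometric) is to give every agent its own encoder but have all encoders project to the same latent size, so that neighbor encodings can be aggregated uniformly. The agent names, observation sizes, and neighbor sets below are made up:

    import torch
    import torch.nn as nn

    # Per-agent encoders with different input sizes, all mapping to one latent size.
    obs_sizes = {"agent_0": 6, "agent_1": 10, "agent_2": 3}
    latent = 16
    encoders = nn.ModuleDict({
        agent: nn.Sequential(nn.Linear(size, latent), nn.ReLU())
        for agent, size in obs_sizes.items()
    })

    def encode_with_neighbors(agent, obs, neighbor_obs):
        """obs: this agent's observation; neighbor_obs: {neighbor_id: obs tensor}."""
        own = encoders[agent](obs)
        nbrs = torch.stack([encoders[n](o) for n, o in neighbor_obs.items()]).mean(dim=0)
        return torch.cat([own, nbrs], dim=-1)  # feed this into the agent's policy/value heads

    z = encode_with_neighbors(
        "agent_0",
        torch.randn(6),
        {"agent_1": torch.randn(10), "agent_2": torch.randn(3)},
    )
    print(z.shape)  # torch.Size([32])

Since all agents share the same latent dimensionality after the projection, the same per-agent encoders could later be placed in front of a proper GNN layer if you decide to move to message passing.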