How severely does this issue affect your experience of using Ray?
- Low: It annoys or frustrates me for a moment.
Hey,
I’m considering using RLlib for a multi-agent driving simulation (with independent or centralized-critic PPO). However, I would like to preprocess each agent’s individual observation using a graph neural network (GNN), which outputs per-agent observations that include relative information about the other agents.
Now I’m facing the question of where/how to implement this preprocessing network. If it’s implemented as part of a custom model/network (e.g. by overriding the forward pass as in this example: Models, Preprocessors, and Action Distributions — Ray 2.21.0), then the preprocessing step happens inside each agent’s policy, i.e. multiple times per environment step. That would be inefficient, because the graph NN only needs a single forward pass per step to preprocess the observations of all agents. A sketch of this variant follows below.
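To make it concrete, this is roughly the custom-model variant I mean (a minimal sketch; `GNNPPOModel` and the registered model name are made up, and the "GNN" is stubbed with an MLP under the assumption of a flat Box observation space):

```python
import torch.nn as nn

from ray.rllib.models import ModelCatalog
from ray.rllib.models.torch.torch_modelv2 import TorchModelV2


class GNNPPOModel(TorchModelV2, nn.Module):
    """Hypothetical custom model: the GNN sits inside forward(), so it
    runs once per agent policy, i.e. N times per env step for N agents."""

    def __init__(self, obs_space, action_space, num_outputs, model_config, name):
        TorchModelV2.__init__(
            self, obs_space, action_space, num_outputs, model_config, name
        )
        nn.Module.__init__(self)
        hidden = 128
        # Stand-in "GNN": a real one would do message passing over the agent
        # graph; an MLP keeps the sketch self-contained. Assumes a flat Box
        # observation space.
        self.gnn = nn.Sequential(nn.Linear(obs_space.shape[0], hidden), nn.ReLU())
        self.policy_head = nn.Linear(hidden, num_outputs)
        self.value_head = nn.Linear(hidden, 1)
        self._features = None

    def forward(self, input_dict, state, seq_lens):
        # The per-agent GNN pass: executed on every policy forward call.
        self._features = self.gnn(input_dict["obs_flat"].float())
        return self.policy_head(self._features), state

    def value_function(self):
        return self.value_head(self._features).squeeze(-1)


ModelCatalog.register_custom_model("gnn_ppo_model", GNNPPOModel)
```

With this setup, every agent’s policy would call `self.gnn` separately on its own observation, which is exactly the redundancy I’d like to avoid.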
The other way would be a custom preprocessor, which is deprecated according to the docs and probably doesn’t support neural nets. I saw that AgentConnectors (Connectors (Beta) — Ray 2.21.0) could help with my problem, but I couldn’t find any examples that use a neural net for the observation transformation. The question there would also be whether the preprocessing network could be updated/backpropagated during training with such a solution; see the sketch after this paragraph for what I have in mind.
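For the connector idea, here is a rough sketch based on my reading of the 2.x beta connector API (class and field names may well be off, and `GNNObsConnector` is made up); the point is batching all agents of one step into a single GNN pass:

```python
from typing import List

import torch

from ray.rllib.connectors.connector import AgentConnector, ConnectorContext
from ray.rllib.utils.typing import AgentConnectorDataType


class GNNObsConnector(AgentConnector):
    """Hypothetical connector: batch all agents' observations of one env
    step into a single GNN forward pass, then scatter the results back."""

    def __init__(self, ctx: ConnectorContext, gnn: torch.nn.Module):
        super().__init__(ctx)
        self.gnn = gnn

    def __call__(
        self, acd_list: List[AgentConnectorDataType]
    ) -> List[AgentConnectorDataType]:
        # Assumption: d.data is the raw per-agent observation at this stage;
        # in the real pipeline it may be a dict (e.g. keyed by SampleBatch.OBS).
        obs_batch = torch.stack(
            [torch.as_tensor(d.data, dtype=torch.float32) for d in acd_list]
        )
        # Connectors run outside the loss, so (as far as I can tell) the GNN
        # would receive no gradients from training, hence no_grad() here.
        with torch.no_grad():
            new_obs = self.gnn(obs_batch)
        return [
            AgentConnectorDataType(d.env_id, d.agent_id, row.numpy())
            for d, row in zip(acd_list, new_obs)
        ]
```

If connectors really do run outside the trained model like this, the GNN would stay frozen during training, which is the backprop concern I mentioned above.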
Is there any other way I could do this with RLlib?
Thanks in advance for any ideas!