Using a state embedding with PPO

lucasalavapena · May 16, 2022, 8:25pm

I have a GNN component in my custom model that creates an embedding for my state. I want to train an RL algorithm (ideally PPO) on the embedding rather than on the raw state. I read a paper that used embeddings as the state together with a policy-gradient (PG) method. As far as I can tell, it is not easy to define such a state directly with gym.Env while still training the GNN end to end.

I see that the current dev version of Ray RLlib seems to have changed the training part of the PPO agent, which might make it easier to modify, but guidance would be greatly appreciated.

gjoliver · May 16, 2022, 9:48pm

Hey, thanks for the question.
I think it basically comes down to encoding your state in the observation (as a numpy array) and then unpacking it into PyTorch Geometric format in the policy. You then use the unpacked data to run and train the model.
@smorad1 has done this before.
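
As a rough illustration of the pack/unpack idea, here is a minimal numpy-only sketch. Everything in it is an assumption for illustration, not RLlib or PyTorch Geometric API: the names pack_graph and unpack_graph, the size limits MAX_NODES, MAX_EDGES and NODE_DIM, and the flat layout itself are all hypothetical. The env would return the packed vector as its observation (for example with a gym.spaces.Box of shape (OBS_SIZE,)), and the custom model would unpack each row before handing it to the GNN, so the GNN stays inside the policy and is trained end to end.

```python
import numpy as np

# Hypothetical fixed limits; pick values that fit your own graphs.
MAX_NODES = 4   # assumed upper bound on nodes per graph
MAX_EDGES = 6   # assumed upper bound on directed edges per graph
NODE_DIM = 3    # assumed node feature size

# Layout: [num_nodes, num_edges, padded node features, padded edge index]
OBS_SIZE = 2 + MAX_NODES * NODE_DIM + 2 * MAX_EDGES


def pack_graph(x, edge_index):
    """Flatten a graph into one fixed-size float32 vector (env side)."""
    n, e = x.shape[0], edge_index.shape[1]
    assert n <= MAX_NODES and e <= MAX_EDGES
    x_pad = np.zeros((MAX_NODES, NODE_DIM), dtype=np.float32)
    x_pad[:n] = x
    # Unused edge slots are filled with -1 so they are never mistaken for node 0.
    ei_pad = np.full((2, MAX_EDGES), -1.0, dtype=np.float32)
    ei_pad[:, :e] = edge_index
    header = np.array([n, e], dtype=np.float32)
    return np.concatenate([header, x_pad.ravel(), ei_pad.ravel()])


def unpack_graph(obs):
    """Recover node features and edge index from one packed vector (policy side)."""
    n, e = int(obs[0]), int(obs[1])
    x_end = 2 + MAX_NODES * NODE_DIM
    x = obs[2:x_end].reshape(MAX_NODES, NODE_DIM)[:n]
    edge_index = obs[x_end:].reshape(2, MAX_EDGES)[:, :e].astype(np.int64)
    return x, edge_index
```

In the policy's model you would convert x and edge_index to torch tensors and build the PyTorch Geometric input from them. Storing integer indices as float32 is only exact for small graphs, so a Dict observation space with a separate integer field may be a better fit for large ones.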

lucasalavapena · May 20, 2022, 9:52am

Thanks for your answer.

Any idea how to encode the state, or how to access the same model in the gym env? The env would somehow need access to the same model that the policy uses and then, I assume, call it inside a with torch.no_grad() block. I think it might be easier to just implement it from scratch without using RLlib.

I have seen @smorad1's graph-conv-memory-paper repo linked here before, but from a fairly quick look I assume that is not where they did this, right?
