Add the experiences to the buffer "by hand"

I am working with RL code that implements algorithms via TensorForce. Training works by adding the experiences to the buffer through a method on the agent; whenever the buffer reaches the step size, training starts.

A minimal example would be:

state = env.reset()
for i in range(timesteps):
    # Step the environment and add the resulting transition to the buffer by hand.
    action = agent.predict(state)
    next_state, reward, done, _ = env.step(action)
    agent.add_to_buffer(state, action, reward, next_state, done)
    state = env.reset() if done else next_state

Adding the experiences by hand is necessary in this case due to the particular structure of the environment. I am aware that RLlib is the state-of-the-art library for RL, and I would like to move the agents from the TensorForce implementation to RLlib. My question is: is there any way to create equivalent code using RLlib, i.e. passing the experiences to the agent externally instead of having them collected under the hood during training?

Hi @carlorop ,

@Lars_Simon_Zehnder has written a few lines on this before.
I am not aware of any "1-liner" solution for this in RLlib right now.
Nevertheless, it is possible. Do you need help with coding it? This is a use case that I am interested in, and we can work on it together if you like. :slight_smile:


Please share the results, if you do! :pray:

Hi @arturn. Undoubtedly it would add considerable value to RLlib. However, I will use other libraries in the meantime. If you manage to implement it, I would be eternally thankful.

Ok, I did not expect that. I will ask Sven whether there are plans for this and whether he has any recommendations on how it should be approached.

I guess you can fill the replay buffer with the agent's predictions using a strategy similar to the one employed to generate offline datasets:

RLlib Offline Datasets — Ray v1.9.0, section "Example: Converting external experiences to batch format"
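
Roughly, that docs example boils down to something like the following sketch (my own adaptation, not a drop-in solution; the module paths, the output directory, and the exact set of batch columns are assumptions that may differ across Ray versions):

# Sketch: converting externally collected experiences into RLlib's batch format,
# following the "converting external experiences" docs example.
from ray.rllib.evaluation.sample_batch_builder import SampleBatchBuilder
from ray.rllib.offline.json_writer import JsonWriter

batch_builder = SampleBatchBuilder()
writer = JsonWriter("/tmp/demo-out")  # placeholder output directory

state = env.reset()
for t in range(timesteps):
    action = agent.predict(state)  # your existing, externally controlled agent
    next_state, reward, done, _ = env.step(action)
    batch_builder.add_values(
        t=t,
        eps_id=0,
        obs=state,
        actions=action,
        rewards=reward,
        dones=done,
        new_obs=next_state,
    )
    state = env.reset() if done else next_state

# Write the collected transitions as an offline dataset that RLlib can train from.
writer.write(batch_builder.build_and_reset())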


You can, of course, instantiate a ReplayBuffer and call add() to add experiences. Or use ReplayActors like in Ape-X. Or you can write an execution_plan that makes use of a ReplayBuffer.
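
A minimal sketch of the first option (instantiating a buffer yourself and adding experiences "by hand") could look like this. It assumes a recent Ray version where the buffers live under ray.rllib.utils.replay_buffers; in Ray 1.9 the corresponding classes sit in ray.rllib.execution.replay_buffer and the signatures differ slightly:

# Sketch: manually feeding single transitions into an RLlib replay buffer.
from ray.rllib.policy.sample_batch import SampleBatch
from ray.rllib.utils.replay_buffers import ReplayBuffer

buffer = ReplayBuffer(capacity=50_000)

state = env.reset()
for _ in range(timesteps):
    action = agent.predict(state)  # your externally controlled agent
    next_state, reward, done, _ = env.step(action)
    # Wrap the single transition in a SampleBatch and add it to the buffer.
    buffer.add(SampleBatch({
        "obs": [state],
        "actions": [action],
        "rewards": [reward],
        "new_obs": [next_state],
        "dones": [done],
    }))
    state = env.reset() if done else next_state

# Later, draw training batches from the buffer.
train_batch = buffer.sample(32)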

I see two ways here:

  • If you plan to run your code on your own machine: instantiate a ReplayBuffer and use it like in your example. In this case you can use an offline dataset to fill it, as @felipeeeantunes suggested (see the sketch after this list).
  • If you plan to use RLlib and Ray to their full capacities, I suspect you will not get around writing code that does not look as simple as your snippet, especially if you want to gradually mix in some experiences manually rather than only filling the buffer at the beginning.
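
For the offline-dataset route, pre-filling such a buffer could look roughly like this (a sketch under the same version assumptions as above; the input path and the number of batches to pre-load are placeholders):

# Sketch: pre-filling a replay buffer from an offline dataset written earlier,
# e.g. with the JsonWriter approach shown above.
from ray.rllib.offline.json_reader import JsonReader
from ray.rllib.utils.replay_buffers import ReplayBuffer

buffer = ReplayBuffer(capacity=50_000)
reader = JsonReader("/tmp/demo-out")  # directory containing the offline JSON files

for _ in range(100):  # number of batches to pre-load; arbitrary here
    buffer.add(reader.next())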

Hey @carlorop, this is a very valid question! I guess it is caused by the fact that in RLlib, adding to and reading from the buffer is usually done under the hood via the execution plan.
You can indeed access the buffer via trainer.local_replay_buffer and then call the buffer's add_batch([some sample batch to add]) method.
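
Put together, that approach could look roughly like the sketch below. It assumes a Ray 1.9-style DQNTrainer; the attribute name (local_replay_buffer) and the add_batch method follow the description in this thread and may differ or require additional batch columns in other Ray versions:

# Sketch: pushing externally collected experiences into a trainer's local buffer.
import ray
from ray.rllib.agents.dqn import DQNTrainer
from ray.rllib.policy.sample_batch import SampleBatch

ray.init()
trainer = DQNTrainer(config={"framework": "tf"}, env="CartPole-v0")

# A hand-built batch of one transition (column names as commonly used by RLlib).
batch = SampleBatch({
    "obs": [obs],
    "actions": [action],
    "rewards": [reward],
    "new_obs": [next_obs],
    "dones": [done],
})

# Add the externally collected experiences to the trainer's replay buffer;
# subsequent trainer.train() calls would then sample from this buffer.
trainer.local_replay_buffer.add_batch(batch)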