I am trying to train two separate agents in two different environments within the same training loop. The output from one env.step call has to be fed into the second environment before I call compute_single_action for the second agent. Currently, in RLlib, everything seems to be encapsulated behind a .train() method, with very little opportunity for customization during training.
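To make the coupling concrete, here is roughly what one step of the joint loop looks like in my head. The two CartPole environments and the commented-out `env_b.set_context(...)` call are only placeholders for my actual environments, and I am assuming a Ray/RLlib version in which `PPOConfig().build()` and `Algorithm.compute_single_action()` are still available (the latter is what I am calling today):

```python
import gymnasium as gym
from ray.rllib.algorithms.ppo import PPOConfig

# Two placeholder environments; in my real setup, env_b depends on env_a's output.
env_a = gym.make("CartPole-v1")
env_b = gym.make("CartPole-v1")

# One PPO algorithm per agent.
algo_a = PPOConfig().environment("CartPole-v1").build()
algo_b = PPOConfig().environment("CartPole-v1").build()

obs_a, _ = env_a.reset()
obs_b, _ = env_b.reset()

for _ in range(10):
    # Agent A acts first.
    action_a = algo_a.compute_single_action(obs_a, explore=True)
    obs_a, reward_a, term_a, trunc_a, _ = env_a.step(action_a)

    # The output of env_a.step must reach env_b *before* agent B acts;
    # set_context is a placeholder for however that coupling is implemented.
    # env_b.set_context(obs_a, reward_a)

    action_b = algo_b.compute_single_action(obs_b, explore=True)
    obs_b, reward_b, term_b, trunc_b, _ = env_b.step(action_b)

    if term_a or trunc_a:
        obs_a, _ = env_a.reset()
    if term_b or trunc_b:
        obs_b, _ = env_b.reset()
```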
The crux of the problem is that I cannot find a good example for the current RLlib version that shows how to perform the following steps explicitly (a skeleton of what I mean follows the list):
- Set up an environment
- Set up the RLlib PPO agent
- In a for loop over a fixed step budget:
  - compute an action from the current observation of the environment,
  - collect the resulting transitions in some form of RLlib buffer class, if one exists,
  - every N steps, perform the PPO update from the collected buffer.
- Evaluate the agent at regular intervals.
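Put differently, I am after a skeleton like the one below (single agent/env for brevity). The buffer is just a Python list of SampleBatch-style dicts because I could not find a documented buffer class intended for this; `TOTAL_STEPS`, `UPDATE_EVERY` and `EVAL_EVERY` are arbitrary numbers of mine; and the `???` comment marks the update step I cannot find a public API for (learn_on_batch? overriding training_step?). I assume `algo.evaluate()` also needs evaluation workers configured in the PPOConfig.

```python
import gymnasium as gym
from ray.rllib.algorithms.ppo import PPOConfig
from ray.rllib.policy.sample_batch import SampleBatch

TOTAL_STEPS = 10_000
UPDATE_EVERY = 512   # train-batch size I would like to control myself
EVAL_EVERY = 2_048

env = gym.make("CartPole-v1")                           # 1. set up an environment
algo = PPOConfig().environment("CartPole-v1").build()   # 2. set up the PPO agent

obs, _ = env.reset()
buffer = []  # stand-in for whatever RLlib buffer class is appropriate

for step in range(TOTAL_STEPS):                         # 3. explicit step loop
    action = algo.compute_single_action(obs, explore=True)
    next_obs, reward, terminated, truncated, _ = env.step(action)

    # Collect one transition, using RLlib's SampleBatch column names.
    buffer.append(
        {SampleBatch.OBS: obs, SampleBatch.ACTIONS: action,
         SampleBatch.REWARDS: reward, SampleBatch.NEXT_OBS: next_obs,
         SampleBatch.TERMINATEDS: terminated}
    )
    obs = next_obs
    if terminated or truncated:
        obs, _ = env.reset()

    if (step + 1) % UPDATE_EVERY == 0:
        # ??? perform one PPO update from `buffer` -- this is the part
        # I cannot find a public, documented API for.
        buffer.clear()

    if (step + 1) % EVAL_EVERY == 0:                    # 4. periodic evaluation
        # algo.evaluate()  (presumably needs evaluation workers in the config)
        pass
```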