Scripted Agent Support

jsuarez5341 · June 4, 2021, 4:56pm

Scripted agents are essential baselines for many classes of environments. The only example I’ve seen in RLlib is the toy rock-paper-scissors model that takes the same action every time. I’m having difficulty figuring out how to implement more complex scripted models with full access to observations but without having to convert actions into logits. Is there any documentation or support for this?

Joseph

jsuarez5341 · June 9, 2021, 8:03pm

Bump – simple version: How do I submit actions instead of logits?

Sertingolix · June 10, 2021, 1:57pm

For prototyping I used a custom model in which I hardcoded my policy. From your question I assume a categorical action distribution. Without changing the distribution, the sampling, or other parts of your training, you could create a different custom model that outputs logits such as torch.tensor([[0., 1000.]], requires_grad=True) for an action of 1. The values need to be floats, because torch only allows requires_grad=True on floating-point tensors. Setting requires_grad=True makes the optimizer think there is something to optimize. This sidesteps the question a bit, but I think it is simpler than changing the rest of the pipeline.
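To make the trick above concrete, here is a minimal, framework-free sketch of why a logit gap of 1000 behaves like submitting the action directly. The rule in `scripted_action`, the constants, and all helper names are hypothetical and only for illustration; in an actual RLlib custom model the resulting list would be wrapped in a float torch tensor as described above.

```python
import math
import random

NUM_ACTIONS = 2   # assumed size of the discrete action space
BIG = 1000.0      # logit gap, taken from the suggestion above


def scripted_action(observation):
    """Hypothetical scripted rule: choose action 1 if the first feature is positive."""
    return 1 if observation[0] > 0 else 0


def action_to_logits(action, num_actions=NUM_ACTIONS, big=BIG):
    """Build logits that put essentially all probability mass on `action`."""
    return [big if i == action else 0.0 for i in range(num_actions)]


def softmax(logits):
    """Numerically stable softmax (subtract the max before exponentiating)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample(logits, rng=random):
    """Sample an action index the way a categorical distribution would."""
    probs = softmax(logits)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# The scripted rule picks action 1 for this observation, and the
# categorical sampler then returns that action deterministically.
logits = action_to_logits(scripted_action([0.5, -1.0]))
chosen = sample(logits)
```

With a gap this large, the probability of every other action underflows to zero in floating point, so the sampled action always equals the scripted one and the rest of the training pipeline can stay unchanged.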
