I am trying to implement meta-RL, for example an outer RL loop wrapped around an inner RL loop. How would one get started with this in RLlib? I am looking for some ideas and guidance. Thanks!
Hey @RickLan, could you take a look at our MAML implementation (rllib/agents/maml/...)? @michaelzhiluo already solved this problem of the different loops in there.
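To make the "different loops" concrete, here is a minimal sketch of the inner/outer structure on a toy problem. It is not RLlib's MAML code and does not use RLlib at all: the tasks are hypothetical 1-D quadratic losses, the function names (`loss_grad`, `inner_loop`, `outer_loop`) are made up for illustration, and the meta-update is a simple first-order, Reptile-style step rather than the full MAML gradient. In an RL setting, the inner loop would instead run a few policy updates on one sampled task.

```python
# Toy meta-learning sketch (assumption: NOT RLlib's MAML implementation).
# Each "task" is a scalar target; the task loss is (theta - target) ** 2.

def loss_grad(theta, target):
    """Gradient of the task loss (theta - target) ** 2 w.r.t. theta."""
    return 2.0 * (theta - target)


def inner_loop(theta, target, lr=0.1, steps=5):
    """Inner loop: adapt a copy of the meta-parameter to a single task."""
    adapted = theta
    for _ in range(steps):
        adapted -= lr * loss_grad(adapted, target)
    return adapted


def outer_loop(targets, meta_lr=0.5, iterations=200):
    """Outer loop: move the meta-parameter toward the adapted parameters.

    This is a first-order, Reptile-style update, swapped in here for
    simplicity instead of the second-order MAML meta-gradient.
    """
    theta = 0.0
    for _ in range(iterations):
        adapted = [inner_loop(theta, t) for t in targets]
        mean_adapted = sum(adapted) / len(adapted)
        theta += meta_lr * (mean_adapted - theta)
    return theta


if __name__ == "__main__":
    meta_theta = outer_loop([-1.0, 1.0, 3.0])
    print("meta-parameter:", meta_theta)
    print("adapted to target 3.0:", inner_loop(meta_theta, 3.0))
```

The point is only the nesting: the outer loop never optimizes a task directly, it only consumes the results of the inner adaptation.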
Thank you for the quick reply. I will take a look there!
mannyv
June 30, 2021, 3:41am
Hi @RickLan,
Another place that might also make sense is the hierarchical training example. It essentially boils down to an outer and inner loop.
https://docs.ray.io/en/latest/rllib-env.html?highlight=multiagent#hierarchical-environments
"""Example of hierarchical training using the multi-agent API.
The example env is that of a "windy maze". The agent observes the current wind
direction and can either choose to stand still, or move in that direction.
You can try out the env directly with:
$ python hierarchical_training.py --flat
A simple hierarchical formulation involves a high-level agent that issues goals
(i.e., go north / south / east / west), and a low-level agent that executes
these goals over a number of time-steps. This can be implemented as a
multi-agent environment with a top-level agent and low-level agents spawned
for each higher-level action. The lower level agent is rewarded for moving
in the right direction.
You can try this formulation with:
$ python hierarchical_training.py # gets ~100 rew after ~100k timesteps
(File preview truncated.)
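As a rough illustration of the high-level/low-level formulation described in that docstring, here is a plain-Python sketch. It is not the windy maze env from hierarchical_training.py and deliberately does not import RLlib: the class `TwoLevelEnv`, the agent ids `"high_level"` and `"low_level"`, and the scripted policies are all hypothetical. It only loosely mimics the idea of returning per-agent dicts so that one agent acts at a time.

```python
# Hypothetical toy env (assumption: not RLlib's example, no RLlib imports).
DIRECTIONS = {"north": (0, 1), "south": (0, -1), "east": (1, 0), "west": (-1, 0)}


class TwoLevelEnv:
    """High-level agent picks a goal direction; low-level agent executes it."""

    def __init__(self, target=(3, 0), low_level_steps=3):
        self.target = target
        self.low_level_steps = low_level_steps
        self.reset()

    def reset(self):
        self.pos = (0, 0)
        self.goal = None
        self.steps_left = 0
        return {"high_level": self.pos}

    def step(self, action_dict):
        if "high_level" in action_dict:
            # Outer step: store the goal and hand control to the low level.
            self.goal = action_dict["high_level"]
            self.steps_left = self.low_level_steps
            return {"low_level": (self.pos, self.goal)}, {"low_level": 0.0}, False

        # Inner step: the low-level agent moves one cell.
        move = action_dict["low_level"]
        dx, dy = DIRECTIONS[move]
        self.pos = (self.pos[0] + dx, self.pos[1] + dy)
        self.steps_left -= 1
        # The low level is rewarded for moving in the requested direction.
        rewards = {"low_level": 1.0 if move == self.goal else -1.0}
        done = self.pos == self.target
        if done or self.steps_left == 0:
            # Control returns to the high level, rewarded only at the target.
            rewards["high_level"] = 1.0 if done else 0.0
            return {"high_level": self.pos}, rewards, done
        return {"low_level": (self.pos, self.goal)}, rewards, done


def run_episode(env, high_policy, low_policy, max_steps=50):
    """Roll out one episode, letting whichever agent has an observation act."""
    obs = env.reset()
    totals = {"high_level": 0.0, "low_level": 0.0}
    for _ in range(max_steps):
        if "high_level" in obs:
            action = {"high_level": high_policy(obs["high_level"])}
        else:
            action = {"low_level": low_policy(*obs["low_level"])}
        obs, rewards, done = env.step(action)
        for agent, reward in rewards.items():
            totals[agent] += reward
        if done:
            break
    return totals


if __name__ == "__main__":
    # Scripted stand-ins for learned policies.
    print(run_episode(TwoLevelEnv(), lambda pos: "east", lambda pos, goal: goal))
```

With learned policies in place of the scripted lambdas, the high-level agent's decisions form the outer loop and the low-level agent's steps form the inner loop.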
Thanks, @mannyv! Will look as well.