I’m trying to train a single agent on multiple different environments simultaneously, using a custom simulator via ExternalEnv. I know that with gym.Env you can access env_config.worker_index, which allows you to implement this. I’m wondering if there is something similar for ExternalEnv, so that I can use the rollout worker indices to assign a different environment to each worker and run them all together.
Note: Each rollout worker will have a different setting of the same simulator.
I think I found the answer to this question. Please let me know if I got the concept wrong, but at least the code is working as intended.
At ray/rllib-env.rst at master · ray-project/ray · GitHub, there’s an example of how to “wrap” your env_config with EnvContext so that you can access worker_index and vector_index. The following is the example code from the link:
```python
from ray.tune.registry import register_env

def env_creator(env_config):
    return MyCustomSimulator(env_config)  # return an env instance

register_env("my_env", env_creator)
```
Honestly, I don’t really understand why having a function env_creator would cause env_config to be wrapped with EnvContext. My understanding is that some method in the registry wraps it when a callable is passed in? (If you know the reason, please point me to the code.)
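From what I can tell (this is my own reading of RLlib’s internals, not something the docs state), it isn’t the registry itself but the rollout worker that does the wrapping: before it calls your registered creator, it builds an EnvContext from the raw env_config dict plus its own index, roughly like this:

```python
# Paraphrased sketch of what the rollout worker does internally
# before calling your registered env_creator (not the exact RLlib source):
from ray.rllib.env.env_context import EnvContext

raw_env_config = {"some_key": "some_value"}  # whatever you passed as "env_config"
worker_index = 1                             # this particular worker's index

env_context = EnvContext(raw_env_config, worker_index=worker_index)
# env_creator(env_context) is then called, so your creator receives an
# EnvContext instead of the plain dict you supplied.
```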
However, you can now access the worker index with env_config.worker_index. For example:
```python
from ray import tune
from ray.rllib.env.external_env import ExternalEnv

class MyCustomSimulator(ExternalEnv):
    def __init__(self, env_config):
        ...
        print(env_config.worker_index)
        ...

tune.run(
    "DQN",
    config={
        "env": "my_env",
        "env_config": {...},
        "num_workers": 6,
    },
)
```
This will print out the numbers 1 through 6, though not necessarily in order. Anyway, the number is all I needed.
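To tie this back to the original goal of giving each rollout worker a different setting of the same simulator, a minimal sketch could look like the following (the WORKER_SETTINGS dict, its keys, and the placeholder spaces are all made up for illustration):

```python
import gym
from ray.rllib.env.external_env import ExternalEnv

# Hypothetical per-worker simulator settings, keyed by rollout worker
# index (1..num_workers); the names and values are illustrative only.
WORKER_SETTINGS = {
    1: {"sim_speed": 1.0},
    2: {"sim_speed": 2.0},
    3: {"sim_speed": 3.0},
}

class MyCustomSimulator(ExternalEnv):
    def __init__(self, env_config):
        # Pick this worker's setting based on its rollout worker index.
        self.setting = WORKER_SETTINGS.get(env_config.worker_index, {})
        super().__init__(
            action_space=gym.spaces.Discrete(2),       # placeholder space
            observation_space=gym.spaces.Discrete(2),  # placeholder space
        )

    def run(self):
        # Drive the external simulator configured with self.setting here.
        ...
```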
There’s one thing to be careful about, however: env_config in your MyCustomSimulator class is now of type EnvConfigDict.
Perfect. Yes, env_config is actually not just a dict, but an EnvContext object (from ray.rllib.env.env_context import EnvContext). It’s a (config) dict for the env, but it also has the following properties:

```python
self.worker_index = worker_index
self.num_workers = num_workers
self.vector_index = vector_index
self.remote = remote
```
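So inside your env you can use it both ways, e.g. (a small illustration; the "sim_speed" key is made up):

```python
from ray.rllib.env.env_context import EnvContext

ctx = EnvContext({"sim_speed": 2.0}, worker_index=3)
assert ctx["sim_speed"] == 2.0  # plain dict access still works
assert ctx.worker_index == 3    # plus the extra attributes listed above
```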
Is there also a way to access the Trainer’s (or the rollout worker’s) config, e.g. the discount factor gamma, in a custom env?
Hey @klausk55, sorry for the late response.
No, there actually is no way to access the Trainer’s config from inside your environment. This is by design, I believe, to keep the environment an independent entity that has no knowledge of where and by whom it’s being looped through.
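One workaround (my own suggestion, not an RLlib feature): since you write both the trainer config and env_config, you can duplicate whatever the env needs into env_config yourself:

```python
config = {
    "env": "my_env",
    "gamma": 0.95,
    # Duplicate any trainer settings the env needs into env_config,
    # so the env can read them via env_config["gamma"]:
    "env_config": {"gamma": 0.95},
    "num_workers": 6,
}
```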
Just add worker_index to your env’s constructor and Ray will pass the index to it:

```python
def __init__(self, worker_index, config=None):
    ...
```