`RolloutWorker` does not properly initialize `policy_map`

Hi, I am unsure whether I am using the RolloutWorker class incorrectly or whether this is a bug. I want to create a remote RolloutWorker and later use it to gather rollouts. If I use as_remote(), the policy_map of the RolloutWorker is not created properly, i.e., accessing it raises AttributeError: 'ActorClass(RolloutWorker)' object has no attribute 'policy_map'.

The code below is a minimal example to reproduce the issue. Everything works fine if I do not use as_remote().remote. I am using Ray v1.10.0. Thanks!

import gym
from ray.rllib.evaluation import RolloutWorker
from ray.rllib.examples.policy.random_policy import RandomPolicy

worker = RolloutWorker.as_remote().remote(env_creator=lambda x: gym.make("MountainCarContinuous-v0"), policy_spec=RandomPolicy)
print(worker.policy_map)

Ah, so the problem here is in how you are trying to access your policy map.

The worker is now a remote object, so you can’t directly get any of its attributes.

You can, however, remotely call any of its functions, and we wrote a function on the worker called apply(worker, fn, *args, **kwargs) for exactly the use case that you have here.

You can use this function like this:

worker = RolloutWorker.as_remote().remote(env_creator=lambda x: gym.make("MountainCarContinuous-v0"), policy_spec=RandomPolicy)
worker.apply.remote(lambda _worker: print(_worker.policy_map))

I didn’t use the approach I would normally use to retrieve attributes of a rollout worker, namely
policy_map = ray.get(worker.apply.remote(lambda _worker: _worker.policy_map)),
because the policy_map object has a threading.RLock() object stored under the attribute self._lock. Ray can’t serialize this lock, so neither the lock nor the policy_map as a whole can be transferred between Ray actors via ray.get calls. We require this lock for some of our algorithms.
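To make the serialization problem concrete, here is a minimal sketch that uses only the standard library. It is not RLlib code: LockedMap is a hypothetical stand-in for an object that keeps a threading.RLock() under self._lock, and plain pickle is used in place of Ray's own serializer, on the assumption that both reject the lock in the same way.

```python
import pickle
import threading


class LockedMap:
    """Hypothetical stand-in for an object that keeps an RLock in self._lock."""

    def __init__(self):
        self.policies = {"default_policy": "some policy object"}
        self._lock = threading.RLock()


def can_pickle(obj):
    """Return True if obj survives pickle.dumps, False if pickling fails."""
    try:
        pickle.dumps(obj)
        return True
    except (TypeError, pickle.PicklingError):
        return False


locked = LockedMap()
print(can_pickle(locked))                 # whole object, lock included
print(can_pickle(list(locked.policies)))  # only the plain-string keys
```

The first check fails because the lock is part of the object's state, while the second succeeds because a list of strings carries no lock. The same idea should carry over to the remote worker: have the function you pass to apply return only serializable pieces (for example the policy IDs, assuming policy_map behaves like a dict) rather than the policy_map itself.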

Lemme know if you have any more questions.

And if you have any questions about Ray remote actors (e.g. how to interact with them and what operations are possible with them), you can refer to this doc page.
