Error in AlphaZero algorithm: The actor died because of an error raised in its creation task

MetallicaSPA · May 15, 2023, 11:42pm

I’m trying to run the AlphaZero examples from the RLlib docs (https://docs.ray.io/en/latest/rllib/rllib-algorithms.html#alphazero), but executing them fails with the following error:

runfile('/home/joaquin/TFM/Doom_RL/RLlib_test.py', wdir='/home/joaquin/TFM/Doom_RL')
True
2023-05-16 01:35:36,559	INFO worker.py:1616 -- Started a local Ray instance. View the dashboard at 127.0.0.1:8265 
2023-05-16 01:35:40,942	INFO tune.py:218 -- Initializing Ray automatically. For cluster usage or custom Ray initialization, call `ray.init(...)` before `Tuner(...)`.
<IPython.core.display.HTML object>
(raylet) E0516 01:35:40.985175255  199150 fork_posix.cc:76]           Other threads are currently calling into gRPC, skipping fork() handlers
(AlphaZero pid=199655) 2023-05-16 01:35:44,420	WARNING algorithm_config.py:635 -- Cannot create AlphaZeroConfig from given `config_dict`! Property __stdout_file__ not supported.
(AlphaZero pid=199655) 2023-05-16 01:35:44,667	INFO algorithm.py:527 -- Current log_level is WARN. For more information, set 'log_level': 'INFO' / 'DEBUG' or use the -v and -vv flags.
(AlphaZero pid=199655) 2023-05-16 01:35:49,448	ERROR actor_manager.py:507 -- Ray error, taking actor 1 out of service. The actor died because of an error raised in its creation task, ray::RolloutWorker.__init__() (pid=199798, ip=192.168.18.9, repr=<ray.rllib.evaluation.rollout_worker.RolloutWorker object at 0x7f0da1f57d30>)
(AlphaZero pid=199655)   File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/evaluation/rollout_worker.py", line 738, in __init__
(AlphaZero pid=199655)     self._update_policy_map(policy_dict=self.policy_dict)
(AlphaZero pid=199655)   File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/evaluation/rollout_worker.py", line 1985, in _update_policy_map
(AlphaZero pid=199655)     self._build_policy_map(
(AlphaZero pid=199655)   File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/evaluation/rollout_worker.py", line 2097, in _build_policy_map
(AlphaZero pid=199655)     new_policy = create_policy_for_framework(
(AlphaZero pid=199655)   File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/utils/policy.py", line 142, in create_policy_for_framework
(AlphaZero pid=199655)     return policy_class(observation_space, action_space, merged_config)
(AlphaZero pid=199655)   File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/algorithms/alpha_zero/alpha_zero.py", line 323, in __init__
(AlphaZero pid=199655)     super().__init__(
(AlphaZero pid=199655)   File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/algorithms/alpha_zero/alpha_zero_policy.py", line 38, in __init__
(AlphaZero pid=199655)     self.env = self.env_creator()
(AlphaZero pid=199655)   File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/algorithms/alpha_zero/alpha_zero.py", line 313, in _env_creator
(AlphaZero pid=199655)     return env_cls(config["env_config"])
(AlphaZero pid=199655)   File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/algorithms/alpha_zero/ranked_rewards.py", line 43, in __init__
(AlphaZero pid=199655)     self._initialize_buffer(r2_config["num_init_rewards"])
(AlphaZero pid=199655)   File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/algorithms/alpha_zero/ranked_rewards.py", line 51, in _initialize_buffer
(AlphaZero pid=199655)     mask = obs["action_mask"]
(AlphaZero pid=199655) IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
(AlphaZero pid=199655) 2023-05-16 01:35:49,449	ERROR actor_manager.py:507 -- Ray error, taking actor 2 out of service. The actor died because of an error raised in its creation task, ray::RolloutWorker.__init__() (pid=199799, ip=192.168.18.9, repr=<ray.rllib.evaluation.rollout_worker.RolloutWorker object at 0x7fcd0d1a0df0>)
(AlphaZero pid=199655)   File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/evaluation/rollout_worker.py", line 738, in __init__
(AlphaZero pid=199655)     self._update_policy_map(policy_dict=self.policy_dict)
(AlphaZero pid=199655)   File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/evaluation/rollout_worker.py", line 1985, in _update_policy_map
(AlphaZero pid=199655)     self._build_policy_map(
(AlphaZero pid=199655)   File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/evaluation/rollout_worker.py", line 2097, in _build_policy_map
(AlphaZero pid=199655)     new_policy = create_policy_for_framework(
(AlphaZero pid=199655)   File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/utils/policy.py", line 142, in create_policy_for_framework
(AlphaZero pid=199655)     return policy_class(observation_space, action_space, merged_config)
(AlphaZero pid=199655)   File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/algorithms/alpha_zero/alpha_zero.py", line 323, in __init__
(AlphaZero pid=199655)     super().__init__(
(AlphaZero pid=199655)   File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/algorithms/alpha_zero/alpha_zero_policy.py", line 38, in __init__
(AlphaZero pid=199655)     self.env = self.env_creator()
(AlphaZero pid=199655)   File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/algorithms/alpha_zero/alpha_zero.py", line 313, in _env_creator
(AlphaZero pid=199655)     return env_cls(config["env_config"])
(AlphaZero pid=199655)   File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/algorithms/alpha_zero/ranked_rewards.py", line 43, in __init__
(AlphaZero pid=199655)     self._initialize_buffer(r2_config["num_init_rewards"])
(AlphaZero pid=199655)   File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/algorithms/alpha_zero/ranked_rewards.py", line 51, in _initialize_buffer
(AlphaZero pid=199655)     mask = obs["action_mask"]
(AlphaZero pid=199655) IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
(AlphaZero pid=199655) 2023-05-16 01:35:49,451	ERROR worker.py:844 -- Exception raised in creation task: The actor died because of an error raised in its creation task, ray::AlphaZero.__init__() (pid=199655, ip=192.168.18.9, repr=AlphaZero)
(AlphaZero pid=199655)   File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/evaluation/worker_set.py", line 242, in _setup
(AlphaZero pid=199655)     self.add_workers(
(AlphaZero pid=199655)   File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/evaluation/worker_set.py", line 635, in add_workers
(AlphaZero pid=199655)     raise result.get()
(AlphaZero pid=199655)   File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/utils/actor_manager.py", line 488, in __fetch_result
(AlphaZero pid=199655)     result = ray.get(r)
(AlphaZero pid=199655) ray.exceptions.RayActorError: The actor died because of an error raised in its creation task, ray::RolloutWorker.__init__() (pid=199798, ip=192.168.18.9, repr=<ray.rllib.evaluation.rollout_worker.RolloutWorker object at 0x7f0da1f57d30>)
(AlphaZero pid=199655)   File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/evaluation/rollout_worker.py", line 738, in __init__
(AlphaZero pid=199655)     self._update_policy_map(policy_dict=self.policy_dict)
(AlphaZero pid=199655)   File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/evaluation/rollout_worker.py", line 1985, in _update_policy_map
(AlphaZero pid=199655)     self._build_policy_map(
(AlphaZero pid=199655)   File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/evaluation/rollout_worker.py", line 2097, in _build_policy_map
(AlphaZero pid=199655)     new_policy = create_policy_for_framework(
(AlphaZero pid=199655)   File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/utils/policy.py", line 142, in create_policy_for_framework
(AlphaZero pid=199655)     return policy_class(observation_space, action_space, merged_config)
(AlphaZero pid=199655)   File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/algorithms/alpha_zero/alpha_zero.py", line 323, in __init__
(AlphaZero pid=199655)     super().__init__(
(AlphaZero pid=199655)   File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/algorithms/alpha_zero/alpha_zero_policy.py", line 38, in __init__
(AlphaZero pid=199655)     self.env = self.env_creator()
(AlphaZero pid=199655)   File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/algorithms/alpha_zero/alpha_zero.py", line 313, in _env_creator
(AlphaZero pid=199655)     return env_cls(config["env_config"])
(AlphaZero pid=199655)   File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/algorithms/alpha_zero/ranked_rewards.py", line 43, in __init__
(AlphaZero pid=199655)     self._initialize_buffer(r2_config["num_init_rewards"])
(AlphaZero pid=199655)   File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/algorithms/alpha_zero/ranked_rewards.py", line 51, in _initialize_buffer
(AlphaZero pid=199655)     mask = obs["action_mask"]
(AlphaZero pid=199655) IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
(AlphaZero pid=199655) 
(AlphaZero pid=199655) During handling of the above exception, another exception occurred:
(AlphaZero pid=199655) 
(AlphaZero pid=199655) ray::AlphaZero.__init__() (pid=199655, ip=192.168.18.9, repr=AlphaZero)
(AlphaZero pid=199655)   File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/algorithms/algorithm.py", line 466, in __init__
(AlphaZero pid=199655)     super().__init__(
(AlphaZero pid=199655)   File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/tune/trainable/trainable.py", line 169, in __init__
(AlphaZero pid=199655)     self.setup(copy.deepcopy(self.config))
(AlphaZero pid=199655)   File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/algorithms/algorithm.py", line 592, in setup
(AlphaZero pid=199655)     self.workers = WorkerSet(
(AlphaZero pid=199655)   File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/evaluation/worker_set.py", line 194, in __init__
(AlphaZero pid=199655)     raise e.args[0].args[2]
(AlphaZero pid=199655) IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
(RolloutWorker pid=199798) 2023-05-16 01:35:49,424	ERROR worker.py:844 -- Exception raised in creation task: The actor died because of an error raised in its creation task, ray::RolloutWorker.__init__() (pid=199798, ip=192.168.18.9, repr=<ray.rllib.evaluation.rollout_worker.RolloutWorker object at 0x7f0da1f57d30>)
(AlphaZero pid=199916) 2023-05-16 01:35:53,752	WARNING algorithm_config.py:635 -- Cannot create AlphaZeroConfig from given `config_dict`! Property __stdout_file__ not supported.
(AlphaZero pid=199916) 2023-05-16 01:35:54,004	INFO algorithm.py:527 -- Current log_level is WARN. For more information, set 'log_level': 'INFO' / 'DEBUG' or use the -v and -vv flags.
2023-05-16 01:35:58,712	ERROR trial_runner.py:1450 -- Trial AlphaZero_CartPole-v1_34201_00001: Error happened when processing _ExecutorEventType.TRAINING_RESULT.
ray.tune.error._TuneNoNextExecutorEventError: Traceback (most recent call last):
  File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/tune/execution/ray_trial_executor.py", line 1231, in get_next_executor_event
    future_result = ray.get(ready_future)
  File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/_private/client_mode_hook.py", line 105, in wrapper
    return func(*args, **kwargs)
  File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/_private/worker.py", line 2523, in get
    raise value
ray.exceptions.RayActorError: The actor died because of an error raised in its creation task, ray::AlphaZero.__init__() (pid=199916, ip=192.168.18.9, repr=AlphaZero)
  File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/evaluation/worker_set.py", line 242, in _setup
    self.add_workers(
  File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/evaluation/worker_set.py", line 635, in add_workers
    raise result.get()
  File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/utils/actor_manager.py", line 488, in __fetch_result
    result = ray.get(r)
ray.exceptions.RayActorError: The actor died because of an error raised in its creation task, ray::RolloutWorker.__init__() (pid=199970, ip=192.168.18.9, repr=<ray.rllib.evaluation.rollout_worker.RolloutWorker object at 0x7f988c1c4c40>)
  File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/evaluation/rollout_worker.py", line 738, in __init__
    self._update_policy_map(policy_dict=self.policy_dict)
  File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/evaluation/rollout_worker.py", line 1985, in _update_policy_map
    self._build_policy_map(
  File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/evaluation/rollout_worker.py", line 2097, in _build_policy_map
    new_policy = create_policy_for_framework(
  File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/utils/policy.py", line 142, in create_policy_for_framework
    return policy_class(observation_space, action_space, merged_config)
  File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/algorithms/alpha_zero/alpha_zero.py", line 323, in __init__
    super().__init__(
  File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/algorithms/alpha_zero/alpha_zero_policy.py", line 38, in __init__
    self.env = self.env_creator()
  File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/algorithms/alpha_zero/alpha_zero.py", line 313, in _env_creator
    return env_cls(config["env_config"])
  File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/algorithms/alpha_zero/ranked_rewards.py", line 43, in __init__
    self._initialize_buffer(r2_config["num_init_rewards"])
  File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/algorithms/alpha_zero/ranked_rewards.py", line 51, in _initialize_buffer
    mask = obs["action_mask"]
IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices

During handling of the above exception, another exception occurred:

ray::AlphaZero.__init__() (pid=199916, ip=192.168.18.9, repr=AlphaZero)
  File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/algorithms/algorithm.py", line 466, in __init__
    super().__init__(
  File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/tune/trainable/trainable.py", line 169, in __init__
    self.setup(copy.deepcopy(self.config))
  File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/algorithms/algorithm.py", line 592, in setup
    self.workers = WorkerSet(
  File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/evaluation/worker_set.py", line 194, in __init__
    raise e.args[0].args[2]
IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices

<IPython.core.display.HTML object>
(AlphaZero pid=199916) 2023-05-16 01:35:58,696	ERROR actor_manager.py:507 -- Ray error, taking actor 2 out of service. The actor died because of an error raised in its creation task, ray::RolloutWorker.__init__() (pid=199971, ip=192.168.18.9, repr=<ray.rllib.evaluation.rollout_worker.RolloutWorker object at 0x7fccc9d5ea60>) [repeated 2x across cluster] (Ray deduplicates logs by default. Set RAY_DEDUP_LOGS=0 to disable log deduplication, or see https://docs.ray.io/en/master/ray-observability/ray-logging.html#log-deduplication for more options.)
(AlphaZero pid=199916)   File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/evaluation/worker_set.py", line 194, in __init__ [repeated 23x across cluster]
(AlphaZero pid=199916)     self._update_policy_map(policy_dict=self.policy_dict) [repeated 5x across cluster]
(AlphaZero pid=199916)   File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/evaluation/rollout_worker.py", line 1985, in _update_policy_map [repeated 5x across cluster]
(AlphaZero pid=199916)     self._build_policy_map( [repeated 5x across cluster]
(AlphaZero pid=199916)   File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/evaluation/rollout_worker.py", line 2097, in _build_policy_map [repeated 5x across cluster]
(AlphaZero pid=199916)     new_policy = create_policy_for_framework( [repeated 5x across cluster]
(AlphaZero pid=199916)   File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/utils/policy.py", line 142, in create_policy_for_framework [repeated 5x across cluster]
(AlphaZero pid=199916)     return policy_class(observation_space, action_space, merged_config) [repeated 5x across cluster]
(AlphaZero pid=199916)     super().__init__( [repeated 6x across cluster]
(AlphaZero pid=199916)     self.env = self.env_creator() [repeated 5x across cluster]
(AlphaZero pid=199916)   File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/algorithms/alpha_zero/alpha_zero.py", line 313, in _env_creator [repeated 5x across cluster]
(AlphaZero pid=199916)     return env_cls(config["env_config"]) [repeated 5x across cluster]
(AlphaZero pid=199916)  [repeated 7x across cluster]
(AlphaZero pid=199916)   File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/algorithms/alpha_zero/ranked_rewards.py", line 51, in _initialize_buffer [repeated 5x across cluster]
(AlphaZero pid=199916)     mask = obs["action_mask"] [repeated 5x across cluster]
(AlphaZero pid=199916) IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices [repeated 6x across cluster]
(AlphaZero pid=199916) IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
(AlphaZero pid=199916) IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
(AlphaZero pid=199916) IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
(AlphaZero pid=199916) IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
(AlphaZero pid=199916) IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
(AlphaZero pid=199916) IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
(AlphaZero pid=199916) IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
(AlphaZero pid=199916) IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
(AlphaZero pid=199916) IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
(AlphaZero pid=199916) IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
(AlphaZero pid=199916) IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
(AlphaZero pid=199916) IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
(AlphaZero pid=199916) IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
(AlphaZero pid=199916) IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
(RolloutWorker pid=199799) IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
(RolloutWorker pid=199970) 2023-05-16 01:35:58,656	ERROR worker.py:844 -- Exception raised in creation task: The actor died because of an error raised in its creation task, ray::RolloutWorker.__init__() (pid=199970, ip=192.168.18.9, repr=<ray.rllib.evaluation.rollout_worker.RolloutWorker object at 0x7f988c1c4c40>)
2023-05-16 01:35:58,725	ERROR trial_runner.py:1450 -- Trial AlphaZero_CartPole-v1_34201_00000: Error happened when processing _ExecutorEventType.TRAINING_RESULT.
ray.tune.error._TuneNoNextExecutorEventError: Traceback (most recent call last):
  File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/tune/execution/ray_trial_executor.py", line 1231, in get_next_executor_event
    future_result = ray.get(ready_future)
  File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/_private/client_mode_hook.py", line 105, in wrapper
    return func(*args, **kwargs)
  File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/_private/worker.py", line 2523, in get
    raise value
ray.exceptions.RayActorError: The actor died because of an error raised in its creation task, ray::AlphaZero.__init__() (pid=199655, ip=192.168.18.9, repr=AlphaZero)
  File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/evaluation/worker_set.py", line 242, in _setup
    self.add_workers(
  File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/evaluation/worker_set.py", line 635, in add_workers
    raise result.get()
  File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/utils/actor_manager.py", line 488, in __fetch_result
    result = ray.get(r)
ray.exceptions.RayActorError: The actor died because of an error raised in its creation task, ray::RolloutWorker.__init__() (pid=199798, ip=192.168.18.9, repr=<ray.rllib.evaluation.rollout_worker.RolloutWorker object at 0x7f0da1f57d30>)
  File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/evaluation/rollout_worker.py", line 738, in __init__
    self._update_policy_map(policy_dict=self.policy_dict)
  File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/evaluation/rollout_worker.py", line 1985, in _update_policy_map
    self._build_policy_map(
  File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/evaluation/rollout_worker.py", line 2097, in _build_policy_map
    new_policy = create_policy_for_framework(
  File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/utils/policy.py", line 142, in create_policy_for_framework
    return policy_class(observation_space, action_space, merged_config)
  File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/algorithms/alpha_zero/alpha_zero.py", line 323, in __init__
    super().__init__(
  File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/algorithms/alpha_zero/alpha_zero_policy.py", line 38, in __init__
    self.env = self.env_creator()
  File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/algorithms/alpha_zero/alpha_zero.py", line 313, in _env_creator
    return env_cls(config["env_config"])
  File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/algorithms/alpha_zero/ranked_rewards.py", line 43, in __init__
    self._initialize_buffer(r2_config["num_init_rewards"])
  File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/algorithms/alpha_zero/ranked_rewards.py", line 51, in _initialize_buffer
    mask = obs["action_mask"]
IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices

During handling of the above exception, another exception occurred:

ray::AlphaZero.__init__() (pid=199655, ip=192.168.18.9, repr=AlphaZero)
  File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/algorithms/algorithm.py", line 466, in __init__
    super().__init__(
  File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/tune/trainable/trainable.py", line 169, in __init__
    self.setup(copy.deepcopy(self.config))
  File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/algorithms/algorithm.py", line 592, in setup
    self.workers = WorkerSet(
  File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/evaluation/worker_set.py", line 194, in __init__
    raise e.args[0].args[2]
IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices

2023-05-16 01:35:58,733	ERROR ray_trial_executor.py:883 -- An exception occurred when trying to stop the Ray actor:Traceback (most recent call last):
  File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/tune/execution/ray_trial_executor.py", line 874, in _resolve_stop_event
    ray.get(future, timeout=timeout)
  File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/_private/client_mode_hook.py", line 105, in wrapper
    return func(*args, **kwargs)
  File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/_private/worker.py", line 2523, in get
    raise value
ray.exceptions.RayActorError: The actor died because of an error raised in its creation task, ray::AlphaZero.__init__() (pid=199655, ip=192.168.18.9, repr=AlphaZero)
  File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/evaluation/worker_set.py", line 242, in _setup
    self.add_workers(
  File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/evaluation/worker_set.py", line 635, in add_workers
    raise result.get()
  File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/utils/actor_manager.py", line 488, in __fetch_result
    result = ray.get(r)
ray.exceptions.RayActorError: The actor died because of an error raised in its creation task, ray::RolloutWorker.__init__() (pid=199798, ip=192.168.18.9, repr=<ray.rllib.evaluation.rollout_worker.RolloutWorker object at 0x7f0da1f57d30>)
  File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/evaluation/rollout_worker.py", line 738, in __init__
    self._update_policy_map(policy_dict=self.policy_dict)
  File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/evaluation/rollout_worker.py", line 1985, in _update_policy_map
    self._build_policy_map(
  File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/evaluation/rollout_worker.py", line 2097, in _build_policy_map
    new_policy = create_policy_for_framework(
  File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/utils/policy.py", line 142, in create_policy_for_framework
    return policy_class(observation_space, action_space, merged_config)
  File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/algorithms/alpha_zero/alpha_zero.py", line 323, in __init__
    super().__init__(
  File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/algorithms/alpha_zero/alpha_zero_policy.py", line 38, in __init__
    self.env = self.env_creator()
  File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/algorithms/alpha_zero/alpha_zero.py", line 313, in _env_creator
    return env_cls(config["env_config"])
  File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/algorithms/alpha_zero/ranked_rewards.py", line 43, in __init__
    self._initialize_buffer(r2_config["num_init_rewards"])
  File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/algorithms/alpha_zero/ranked_rewards.py", line 51, in _initialize_buffer
    mask = obs["action_mask"]
IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices

During handling of the above exception, another exception occurred:

ray::AlphaZero.__init__() (pid=199655, ip=192.168.18.9, repr=AlphaZero)
  File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/algorithms/algorithm.py", line 466, in __init__
    super().__init__(
  File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/tune/trainable/trainable.py", line 169, in __init__
    self.setup(copy.deepcopy(self.config))
  File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/algorithms/algorithm.py", line 592, in setup
    self.workers = WorkerSet(
  File "/home/joaquin/anaconda3/lib/python3.9/site-packages/ray/rllib/evaluation/worker_set.py", line 194, in __init__
    raise e.args[0].args[2]
IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices

2023-05-16 01:35:58,735	ERROR tune.py:941 -- Trials did not complete: [AlphaZero_CartPole-v1_34201_00000, AlphaZero_CartPole-v1_34201_00001]
2023-05-16 01:35:58,735	INFO tune.py:945 -- Total run time: 17.79 seconds (17.76 seconds for the tuning loop).

I’m using:

Ray 2.4.0 (installed the default, air, tune, rllib and serve extras via pip following the instructions here, without any errors)
Gymnasium 0.28.1 (I need it for compatibility reasons with VizDoom 1.2.0)
Spyder IDE running in Anaconda, using Python 3.9

Things I have tried:

Using a different gymnasium version
Using gym library instead of gymnasium
Modifying the environment (importing gymnasium and creating the same environment with env.make)
Using other algorithms; the PPO one seems to work fine.

What’s happening with the AlphaZero algorithm? I need it for my thesis. Thanks in advance.
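
One possible reading of the traceback, offered as a hedged sketch rather than a confirmed diagnosis: the message in the final frame is the one NumPy raises when an array is indexed with a string, which suggests that `obs` reached `mask = obs["action_mask"]` as a plain array rather than as a dict with an `action_mask` key. The helper below is hypothetical (`check_masked_obs` is not part of RLlib) and uses only the standard library. It checks whether an observation has the dict shape that line expects. The `"obs"` key in the sample dict is an assumption made for illustration.

```python
from collections.abc import Mapping


def check_masked_obs(obs, key="action_mask"):
    """Return (ok, reason): can this observation be indexed as obs[key]?"""
    if isinstance(obs, tuple):
        # gymnasium-style reset() returns (obs, info); the tuple itself has no keys.
        return False, f"got a tuple of length {len(obs)}; reset() may be returning (obs, info)"
    if not isinstance(obs, Mapping):
        return False, f"got {type(obs).__name__}, not a dict-like observation"
    if key not in obs:
        return False, f"dict observation has no {key!r} key (keys: {list(obs)})"
    return True, "ok"


# A flat observation, standing in for the 4-element array a plain CartPole-v1 returns.
flat_obs = [0.01, -0.02, 0.03, 0.04]
# A dict observation of the shape the failing line expects (the "obs" key is an assumption).
masked_obs = {"obs": flat_obs, "action_mask": [1, 1]}

print(check_masked_obs(flat_obs))
print(check_masked_obs(masked_obs))
```

Calling this on the value returned by the environment's `reset()` (before handing the environment to RLlib) would show whether the observation is a dict at all, which narrows down whether the problem is in the environment setup or in the library version.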

Rohan138 · May 24, 2023, 2:24am

Hi, we’re currently upgrading RLlib to Gymnasium 0.28.1 in this PR: "[RLlib]: bump Gymnasium to 0.28.1" by Rohan138 (Pull Request #35698, ray-project/ray on GitHub).

Once that’s merged, could you try again using the Ray nightly wheel? It should take a few days. Thanks in advance.
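
When retrying with a different wheel, it can help to confirm which versions are actually installed in the interpreter the IDE is using. This is a minimal sketch using only the standard library. The helper name `installed_versions` is made up for illustration, and the names passed to it are assumed to be the PyPI distribution names of the packages mentioned in this thread.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The distribution is not installed in this interpreter's environment.
            found[name] = None
    return found


# Print one line per package; missing packages are reported rather than raising.
for name, ver in installed_versions(["ray", "gymnasium", "gym", "vizdoom"]).items():
    print(f"{name}: {ver or 'not installed'}")
```

Running this from the same console that launches the training script avoids checking a different environment by mistake.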
