Error while initializing AlphaZero

How severe does this issue affect your experience of using Ray?

  • High: It blocks me to complete my task.

When I try to initialize AlphaZero on Ray, it throws an error.

Code:
from ray.rllib.algorithms.alpha_zero import AlphaZeroConfig
config = AlphaZeroConfig()
config = config.training(sgd_minibatch_size=256)
config = config.resources(num_gpus=0)
config = config.rollouts(num_rollout_workers=4)
print(config.to_dict())

algo = config.build(env="CartPole-v1")

The code throws the following error:

~/anaconda3/envs/amazonei_pytorch_latest_p37/lib/python3.7/site-packages/ray/rllib/algorithms/alpha_zero/ranked_rewards.py in __init__()
41 self.r2_buffer = RankedRewardsBuffer(max_buffer_length, percentile)
42 if r2_config["initialize_buffer"]:
---> 43 self._initialize_buffer(r2_config["num_init_rewards"])
44
45 def _initialize_buffer(self, num_init_rewards=100):

~/anaconda3/envs/amazonei_pytorch_latest_p37/lib/python3.7/site-packages/ray/rllib/algorithms/alpha_zero/ranked_rewards.py in _initialize_buffer()
49 terminated = truncated = False
50 while not terminated and not truncated:
---> 51 mask = obs["action_mask"]
52 probs = mask / mask.sum()
53 action = np.random.choice(np.arange(mask.shape[0]), p=probs)

IndexError: only integers, slices (:), ellipsis (...), numpy.newaxis (None) and integer or boolean arrays are valid indices.
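
A likely cause, shown as a minimal sketch: the traceback fails at obs["action_mask"], which only works when the observation is a dict. CartPole-v1 returns a flat NumPy array, and indexing an array with a string raises this same IndexError. The array below is an assumed stand-in for a CartPole-style observation; no Ray code is involved.

```python
import numpy as np

# Assumed stand-in for a flat CartPole-style observation (4 floats).
obs = np.zeros(4, dtype=np.float32)

try:
    # Same lookup as line 51 of ranked_rewards.py in the traceback.
    mask = obs["action_mask"]
except IndexError as exc:
    message = str(exc)
    print(type(exc).__name__, "-", message)
```

If this reproduces the message, the problem is probably the environment's observation format rather than the AlphaZero config itself.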

I tried changing the environment and using the samples mentioned in the documentation:

https://docs.ray.io/en/releases-2.5.0/rllib/rllib-algorithms.html#alphazero

but I still get the same error.

Is this an internal issue? Is there any way to correct it?

Any leads will be greatly appreciated.

Sorry to hear that. Can you please post a full reproduction script as described in the Mini forum guide/self-help guide? Cheers 🙂
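
In the meantime, here is a rough sketch of the observation format that the code in the traceback seems to expect: a dict with an "action_mask" entry next to the raw observation. ToyEnv and ActionMaskWrapper are hypothetical names made up for this illustration, not RLlib classes, and the get_state/set_state methods are included on the assumption that AlphaZero's tree search needs to save and restore the environment state. Treat it as a starting point, not a verified fix.

```python
import numpy as np


class ToyEnv:
    """Hypothetical env with flat observations and two discrete actions."""

    num_actions = 2

    def __init__(self):
        self.state = np.zeros(4, dtype=np.float32)

    def reset(self):
        self.state = np.zeros(4, dtype=np.float32)
        return self.state.copy(), {}

    def step(self, action):
        # Toy dynamics: shift every component up or down by one.
        self.state = self.state + (1.0 if action == 1 else -1.0)
        terminated = bool(abs(self.state[0]) >= 3)
        return self.state.copy(), 1.0, terminated, False, {}


class ActionMaskWrapper:
    """Hypothetical wrapper returning {"obs": ..., "action_mask": ...}."""

    def __init__(self, env):
        self.env = env

    def _wrap(self, obs):
        # Assumption: every action is always legal in this toy env.
        mask = np.ones(self.env.num_actions, dtype=np.float32)
        return {"obs": obs, "action_mask": mask}

    def reset(self):
        obs, info = self.env.reset()
        return self._wrap(obs), info

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        return self._wrap(obs), reward, terminated, truncated, info

    def get_state(self):
        # Assumed to be needed so a tree search can snapshot the env.
        return self.env.state.copy()

    def set_state(self, state):
        # Restore a snapshot and return the matching dict observation.
        self.env.state = state.copy()
        return self._wrap(self.env.state.copy())


env = ActionMaskWrapper(ToyEnv())
obs, _ = env.reset()

# The same three lines that fail in _initialize_buffer now work.
mask = obs["action_mask"]
probs = mask / mask.sum()
action = np.random.choice(np.arange(mask.shape[0]), p=probs)
```

With a real environment, the same idea would apply to a Gymnasium env, together with a matching Dict observation space; those details are left out here because they depend on the Ray and Gymnasium versions in use.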