How severe does this issue affect your experience of using Ray?
- High: It blocks me to complete my task.
While trying to build AlphaZero in Ray RLlib, I get the error shown below.
Code:
from ray.rllib.algorithms.alpha_zero import AlphaZeroConfig
config = AlphaZeroConfig()
config = config.training(sgd_minibatch_size=256)
config = config.resources(num_gpus=0)
config = config.rollouts(num_rollout_workers=4)
print(config.to_dict())
algo = config.build(env="CartPole-v1")
The code throws the following error:
~/anaconda3/envs/amazonei_pytorch_latest_p37/lib/python3.7/site-packages/ray/rllib/algorithms/alpha_zero/ranked_rewards.py in __init__()
41 self.r2_buffer = RankedRewardsBuffer(max_buffer_length, percentile)
42 if r2_config["initialize_buffer"]:
---> 43 self._initialize_buffer(r2_config["num_init_rewards"])
44
45 def _initialize_buffer(self, num_init_rewards=100):
~/anaconda3/envs/amazonei_pytorch_latest_p37/lib/python3.7/site-packages/ray/rllib/algorithms/alpha_zero/ranked_rewards.py in _initialize_buffer()
49 terminated = truncated = False
50 while not terminated and not truncated:
---> 51 mask = obs["action_mask"]
52 probs = mask / mask.sum()
53 action = np.random.choice(np.arange(mask.shape[0]), p=probs)
IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices.
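Reading the traceback, line 51 indexes the observation with the string key "action_mask", so the buffer code appears to assume a dict observation. CartPole-v1 returns a flat NumPy array, which would explain the IndexError. Below is a minimal sketch of that mismatch using only NumPy. The function name sample_masked_action and the dict layout are hypothetical, written to mirror the three traceback lines; they are not RLlib's API.

```python
import numpy as np


def sample_masked_action(obs, rng):
    """Mirror the traceback lines: read the mask, normalize it, sample an action."""
    mask = obs["action_mask"]
    probs = mask / mask.sum()
    return int(rng.choice(np.arange(mask.shape[0]), p=probs))


rng = np.random.default_rng(0)

# CartPole-style observation: a flat float array with no "action_mask" key.
flat_obs = np.zeros(4, dtype=np.float32)
error_name = None
try:
    sample_masked_action(flat_obs, rng)
except IndexError as exc:
    # Indexing a plain ndarray with a string raises IndexError.
    error_name = type(exc).__name__
    print("flat array observation ->", error_name)

# Hypothetical dict observation carrying the key the buffer code reads.
# Only action 0 is allowed by this mask.
dict_obs = {"obs": flat_obs, "action_mask": np.array([1.0, 0.0])}
action = sample_masked_action(dict_obs, rng)
print("dict observation -> action", action)
```

If this reading is right, the environment passed to config.build would need to return dict observations that include an "action_mask" entry, which plain CartPole-v1 does not.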
I tried other environments and the samples from the documentation:
https://docs.ray.io/en/releases-2.5.0/rllib/rllib-algorithms.html#alphazero
but I still get the same error.
Is this a bug inside RLlib, or is there something I can change on my side to fix it?
Any leads will be greatly appreciated.