RLlib with PettingZoo environment fails

How severe does this issue affect your experience of using Ray?

  • High: It blocks me from completing my task.

I want to try RLlib with a PettingZoo environment, which is a multi-agent environment.
I ran the demo from the project here, and it does not work.
Part of the output is shown below:

(RolloutWorker pid=31114) 2022-07-22 17:16:51,659       ERROR worker.py:451 -- Exception raised in creation task: The actor died because of an error raised in its creation task, ray::RolloutWorker.__init__() (pid=31114, ip=223.193.8.206, repr=<ray.rllib.evaluation.rollout_worker.RolloutWorker object at 0x7fbf4e0318d0>)
(RolloutWorker pid=31114)   File "/home/xxx/anaconda3/envs/PettingZoo/lib/python3.7/site-packages/ray/rllib/evaluation/rollout_worker.py", line 634, in __init__
(RolloutWorker pid=31114)     seed=seed,
(RolloutWorker pid=31114)   File "/home/xxx/anaconda3/envs/PettingZoo/lib/python3.7/site-packages/ray/rllib/evaluation/rollout_worker.py", line 1789, in _build_policy_map
(RolloutWorker pid=31114)     name, orig_cls, obs_space, act_space, conf, merged_conf
(RolloutWorker pid=31114)   File "/home/xxx/anaconda3/envs/PettingZoo/lib/python3.7/site-packages/ray/rllib/policy/policy_map.py", line 152, in create_policy
(RolloutWorker pid=31114)     self[policy_id] = class_(observation_space, action_space, merged_config)
(RolloutWorker pid=31114)   File "/home/xxx/anaconda3/envs/PettingZoo/lib/python3.7/site-packages/ray/rllib/agents/ppo/ppo_torch_policy.py", line 59, in __init__
(RolloutWorker pid=31114)     self._initialize_loss_from_dummy_batch()
(RolloutWorker pid=31114)   File "/home/xxx/anaconda3/envs/PettingZoo/lib/python3.7/site-packages/ray/rllib/policy/policy.py", line 905, in _initialize_loss_from_dummy_batch
(RolloutWorker pid=31114)     self._dummy_batch, explore=False
(RolloutWorker pid=31114)   File "/home/xxx/anaconda3/envs/PettingZoo/lib/python3.7/site-packages/ray/rllib/policy/torch_policy.py", line 336, in compute_actions_from_input_dict
(RolloutWorker pid=31114)     input_dict, state_batches, seq_lens, explore, timestep
(RolloutWorker pid=31114)   File "/home/xxx/anaconda3/envs/PettingZoo/lib/python3.7/site-packages/ray/rllib/utils/threading.py", line 21, in wrapper
(RolloutWorker pid=31114)     return func(self, *a, **k)
(RolloutWorker pid=31114)   File "/home/xxx/anaconda3/envs/PettingZoo/lib/python3.7/site-packages/ray/rllib/policy/torch_policy.py", line 997, in _compute_action_helper
(RolloutWorker pid=31114)     dist_inputs, state_out = self.model(input_dict, state_batches, seq_lens)
(RolloutWorker pid=31114)   File "/home/xxx/anaconda3/envs/PettingZoo/lib/python3.7/site-packages/ray/rllib/models/modelv2.py", line 259, in __call__
(RolloutWorker pid=31114)     res = self.forward(restored, state or [], seq_lens)
(RolloutWorker pid=31114)   File "rayc.py", line 31, in forward
(RolloutWorker pid=31114)     model_out = self.model(input_dict["obs"].permute(0, 3, 1, 2))
(RolloutWorker pid=31114)   File "/home/xxx/anaconda3/envs/PettingZoo/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
(RolloutWorker pid=31114)     return forward_call(*input, **kwargs)
(RolloutWorker pid=31114)   File "/home/xxx/anaconda3/envs/PettingZoo/lib/python3.7/site-packages/torch/nn/modules/container.py", line 141, in forward
(RolloutWorker pid=31114)     input = module(input)
(RolloutWorker pid=31114)   File "/home/xxx/anaconda3/envs/PettingZoo/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
(RolloutWorker pid=31114)     return forward_call(*input, **kwargs)
(RolloutWorker pid=31114)   File "/home/xxx/anaconda3/envs/PettingZoo/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 446, in forward
(RolloutWorker pid=31114)     return self._conv_forward(input, self.weight, self.bias)
(RolloutWorker pid=31114)   File "/home/xxx/anaconda3/envs/PettingZoo/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 443, in _conv_forward
(RolloutWorker pid=31114)     self.padding, self.dilation, self.groups)
(RolloutWorker pid=31114) RuntimeError: expected scalar type Byte but found Float
Traceback (most recent call last):
  File "rayc.py", line 121, in <module>
    "policy_mapping_fn": (lambda agent_id: policy_ids[0]),
  File "/home/xxx/anaconda3/envs/PettingZoo/lib/python3.7/site-packages/ray/tune/tune.py", line 741, in run
    raise TuneError("Trials did not complete", incomplete_trials)
ray.tune.error.TuneError: ('Trials did not complete', [PPO_pistonball_v6_01662_00000])

How can I fix this, or where can I find a working demo?
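
One possible reading of the traceback: the final RuntimeError suggests the convolution in the custom model's forward() (rayc.py, line 31) receives a uint8 (Byte) observation tensor while its weights are float32. If that is the cause, casting the observation to float before the conv layers may resolve it. The sketch below is only an illustration under that assumption. TinyCNN, its layer sizes, the 84x84x3 observation shape, and the division by 255 are hypothetical stand-ins, not the contents of the original rayc.py.

```python
import torch
from torch import nn


class TinyCNN(nn.Module):
    """Hypothetical stand-in for the custom model in rayc.py (sizes are illustrative)."""

    def __init__(self, in_channels: int = 3, num_outputs: int = 5):
        super().__init__()
        self.model = nn.Sequential(
            nn.Conv2d(in_channels, 8, kernel_size=3, stride=2),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(8, num_outputs),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        # Observations arrive as (batch, height, width, channels); conv layers
        # expect (batch, channels, height, width), hence the permute.
        x = obs.permute(0, 3, 1, 2)
        # Cast uint8 pixels to float32 so the input dtype matches the conv weights.
        # Scaling to [0, 1] is an optional normalization choice, not a requirement.
        x = x.float() / 255.0
        return self.model(x)


# Assumed observation format: a batch of uint8 images.
obs = torch.zeros((4, 84, 84, 3), dtype=torch.uint8)
out = TinyCNN()(obs)
print(out.dtype, tuple(out.shape))
```

In the original script, the analogous change would be to call .float() on input_dict["obs"] before passing it to self.model, assuming the observations really are uint8 images.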