How severely does this issue affect your experience of using Ray?
- High: It blocks me from completing my task.
Hi all,

First I set up the default PPO implementation with my custom env, did some troubleshooting, and it was working. Then I built a custom neural network that subclasses `TorchRLModule` and, for now, overrides only the `_forward` method. I am using the new API stack.
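For reference, here is a minimal sketch of the kind of module I have (the real network differs; the layer sizes and the flat-observation assumption are placeholders):

```python
import torch
from ray.rllib.core.columns import Columns
from ray.rllib.core.rl_module.torch import TorchRLModule


class NeuralNetwork(TorchRLModule):
    def setup(self):
        # Small policy MLP mapping flat observations to action logits.
        obs_dim = self.observation_space.shape[0]
        num_actions = self.action_space.n
        self._pi = torch.nn.Sequential(
            torch.nn.Linear(obs_dim, 64),
            torch.nn.ReLU(),
            torch.nn.Linear(64, num_actions),
        )

    def _forward(self, batch, **kwargs):
        # Only action-distribution inputs are returned -- the module
        # produces no value-function output anywhere.
        return {Columns.ACTION_DIST_INPUTS: self._pi(batch[Columns.OBS])}
```

And this is my PPO config: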
```python
from ray.rllib.algorithms.ppo import PPOConfig
from ray.rllib.algorithms.ppo.ppo_catalog import PPOCatalog
from ray.rllib.core.rl_module.rl_module import RLModuleSpec

config = (
    PPOConfig()
    # New API stack: EnvRunner/ConnectorV2 + RLModule/Learner.
    .api_stack(
        enable_env_runner_and_connector_v2=True,
        enable_rl_module_and_learner=True,
    )
    .environment("custom-env-69-v0")
    .framework("torch")
    .env_runners(
        num_env_runners=2,
        num_envs_per_env_runner=1,
        num_cpus_per_env_runner=2,
        num_gpus_per_env_runner=0,
        rollout_fragment_length="auto",
    )
    .learners(
        num_learners=1,
        num_gpus_per_learner=1,
    )
    .rl_module(
        rl_module_spec=RLModuleSpec(
            module_class=NeuralNetwork,  # my custom TorchRLModule (sketched above)
            observation_space=self.env.observation_space,  # self.env is my env instance
            action_space=self.env.action_space,
            inference_only=False,
            model_config={},
            catalog_class=PPOCatalog,
        )
    )
    .training(
        lr=0.001,
        gamma=0.99,
        train_batch_size_per_learner=8,
        num_epochs=20,
        minibatch_size=2,
    )
    .evaluation(
        evaluation_num_env_runners=1,
        evaluation_duration=10,
        evaluation_interval=2,
        evaluation_duration_unit="episodes",
        evaluation_parallel_to_training=True,
    )
)
```
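For completeness, this is roughly how I register the env and kick off training (`CustomEnv` is a stand-in name for my actual gymnasium env class):

```python
from ray.tune.registry import register_env

# "custom-env-69-v0" is my own gymnasium env; CustomEnv is a placeholder.
register_env("custom-env-69-v0", lambda env_config: CustomEnv(env_config))

algo = config.build()
for _ in range(5):
    print(algo.train())
```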
My problem is that I'm not sure whether I'm using RLModule, Catalog, and ConnectorV2 properly, so I haven't been able to fix this error:
```
ray.exceptions.RayTaskError(KeyError): ray::_WrappedExecutable.apply() (pid=24266, ip=192.168.0.17, actor_id=ab82a87e1b792b29a914c9d701000000, repr=<ray.train._internal.worker_group._WrappedExecutable object at 0x7636ca5ca1d0>)
  File "/home/alyy/PycharmProjects/RLProject/.venv/lib/python3.11/site-packages/ray/rllib/core/learner/learner.py", line 1588, in apply
    return func(self, *_args, **_kwargs)
  File "/home/alyy/PycharmProjects/RLProject/.venv/lib/python3.11/site-packages/ray/rllib/core/learner/learner_group.py", line 383, in _learner_update
    result = _learner.update_from_episodes(
  File "/home/alyy/PycharmProjects/RLProject/.venv/lib/python3.11/site-packages/ray/rllib/core/learner/learner.py", line 1073, in update_from_episodes
    return self._update_from_batch_or_episodes(
  File "/home/alyy/PycharmProjects/RLProject/.venv/lib/python3.11/site-packages/ray/rllib/core/learner/learner.py", line 1400, in _update_from_batch_or_episodes
    fwd_out, loss_per_module, tensor_metrics = self._update(
  File "/home/alyy/PycharmProjects/RLProject/.venv/lib/python3.11/site-packages/ray/rllib/core/learner/torch/torch_learner.py", line 495, in _update
    return self._possibly_compiled_update(batch)
  File "/home/alyy/PycharmProjects/RLProject/.venv/lib/python3.11/site-packages/ray/rllib/core/learner/torch/torch_learner.py", line 146, in _uncompiled_update
    loss_per_module = self.compute_losses(fwd_out=fwd_out, batch=batch)
  File "/home/alyy/PycharmProjects/RLProject/.venv/lib/python3.11/site-packages/ray/rllib/core/learner/learner.py", line 913, in compute_losses
    loss = self.compute_loss_for_module(
  File "/home/alyy/PycharmProjects/RLProject/.venv/lib/python3.11/site-packages/ray/rllib/algorithms/ppo/torch/ppo_torch_learner.py", line 85, in compute_loss_for_module
    batch[Postprocessing.ADVANTAGES] * logp_ratio,
    ~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alyy/PycharmProjects/RLProject/.venv/lib/python3.11/site-packages/ray/rllib/policy/sample_batch.py", line 973, in __getitem__
    value = dict.__getitem__(self, key)
KeyError: 'advantages'
```
So the batch doesn't have a key named `advantages`. I'm assuming this is related to my implementation of the algorithm/network, since I'm only customizing the env and the neural network for now.
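My current guess: PPO's learner needs value predictions from the module to compute GAE advantages, and a module that only returns action-distribution inputs never provides them. If that's right, would something like the following be the fix? (Implementing `ValueFunctionAPI` is my assumption based on recent Ray versions, not something I've verified.)

```python
import torch
from ray.rllib.core.columns import Columns
from ray.rllib.core.rl_module.apis import ValueFunctionAPI
from ray.rllib.core.rl_module.torch import TorchRLModule


class NeuralNetwork(TorchRLModule, ValueFunctionAPI):
    def setup(self):
        obs_dim = self.observation_space.shape[0]
        num_actions = self.action_space.n
        # Policy head as in the sketch above, plus a separate value head --
        # the value head is the part my current module is missing.
        self._pi = torch.nn.Sequential(
            torch.nn.Linear(obs_dim, 64),
            torch.nn.ReLU(),
            torch.nn.Linear(64, num_actions),
        )
        self._vf = torch.nn.Sequential(
            torch.nn.Linear(obs_dim, 64),
            torch.nn.ReLU(),
            torch.nn.Linear(64, 1),
        )

    def _forward(self, batch, **kwargs):
        return {Columns.ACTION_DIST_INPUTS: self._pi(batch[Columns.OBS])}

    def compute_values(self, batch, embeddings=None):
        # Value estimates of shape (B,), which (I believe) the GAE step
        # uses to add the "advantages" key to the train batch.
        return self._vf(batch[Columns.OBS]).squeeze(-1)
```

Any pointers on whether this is the right direction, or whether my Catalog/ConnectorV2 usage also needs changes, would be much appreciated.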