Compute Action with LSTM

hermmanhender · May 16, 2024, 7:35am

Hi! I had the same issue and solved it with the code posted here, but then a new error appeared that I don’t know how to fix. The complete error message is:

Traceback (most recent call last):
  File "c:\Users\grhen\Documents\GitHub\eprllib_experiments\active_climatization\init_experiment\test_trained_OnOffHVAC.py", line 110, in <module>
    init_drl_evaluation(
  File "c:\Users\grhen\anaconda3\envs\eprllib1-1-1\lib\site-packages\eprllib\postprocess\marl_init_evaluation.py", line 88, in init_drl_evaluation
    action, state_out, _ = policy['shared_policy'].compute_single_action(obs=obs_dict[agent], state=state)
  File "c:\Users\grhen\anaconda3\envs\eprllib1-1-1\lib\site-packages\ray\rllib\policy\policy.py", line 552, in compute_single_action
    out = self.compute_actions_from_input_dict(
  File "c:\Users\grhen\anaconda3\envs\eprllib1-1-1\lib\site-packages\ray\rllib\policy\torch_policy_v2.py", line 557, in compute_actions_from_input_dict
    return self._compute_action_helper(
  File "c:\Users\grhen\anaconda3\envs\eprllib1-1-1\lib\site-packages\ray\rllib\utils\threading.py", line 24, in wrapper
    return func(self, *a, **k)
  File "c:\Users\grhen\anaconda3\envs\eprllib1-1-1\lib\site-packages\ray\rllib\policy\torch_policy_v2.py", line 1260, in _compute_action_helper
    dist_inputs, state_out = self.model(input_dict, state_batches, seq_lens)
  File "c:\Users\grhen\anaconda3\envs\eprllib1-1-1\lib\site-packages\ray\rllib\models\modelv2.py", line 255, in __call__
    res = self.forward(restored, state or [], seq_lens)
  File "c:\Users\grhen\anaconda3\envs\eprllib1-1-1\lib\site-packages\ray\rllib\models\torch\recurrent_net.py", line 247, in forward
    torch.reshape(input_dict[SampleBatch.PREV_REWARDS].float(), [-1, 1])
  File "c:\Users\grhen\anaconda3\envs\eprllib1-1-1\lib\site-packages\ray\rllib\policy\sample_batch.py", line 950, in __getitem__
    value = dict.__getitem__(self, key)
KeyError: 'prev_rewards'

Can you provide me with some help?
Thanks!
Germán

PS: I’m using Ray 2.20.0 on Windows 11.
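
Note on the traceback above: the KeyError is raised in recurrent_net.py while reading SampleBatch.PREV_REWARDS, which suggests the LSTM wrapper was configured to take the previous reward as input (possibly via "lstm_use_prev_reward": True in the model config) while the call to compute_single_action on line 88 passes only obs and state. A minimal sketch of one possible fix, assuming that diagnosis is right, is to pass prev_action and prev_reward on every call. The helper name step_recurrent_policy is hypothetical and is not part of RLlib or eprllib; prev_action and prev_reward are keyword arguments of Policy.compute_single_action.

```python
from typing import Any, List, Tuple


def step_recurrent_policy(
    policy: Any,
    obs: Any,
    state: List[Any],
    prev_action: Any,
    prev_reward: float,
) -> Tuple[Any, List[Any]]:
    """Hypothetical helper: query a recurrent policy for one action.

    Assumes `policy` exposes RLlib's Policy.compute_single_action and
    that the model expects the previous action and/or reward as input.
    """
    action, state_out, _ = policy.compute_single_action(
        obs=obs,
        state=state,
        prev_action=prev_action,
        prev_reward=prev_reward,
    )
    # Return the new recurrent state so the caller can feed it back in
    # on the next step.
    return action, state_out


# Sketch of use inside an evaluation loop (names other than
# get_initial_state are assumptions about the surrounding code):
#
#   state = policy.get_initial_state()
#   prev_action, prev_reward = 0, 0.0  # first-step placeholders; 0 assumes a Discrete action space
#   action, state = step_recurrent_policy(policy, obs, state, prev_action, prev_reward)
#   ... step the environment, then set prev_action = action and
#   prev_reward = the reward returned by the environment
```

If the model was trained without the previous reward as input, the extra arguments should not be needed and the cause is likely elsewhere.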
