Compute Action with LSTM

I am using Ray (RLlib) with normal, non-recurrent layers and it works very well, but I cannot find in the documentation how to compute actions when the model has LSTM layers.

    action = trainer.compute_single_action(obs)

I always get this error:

    action = self.trainer.compute_single_action(obs)
  File "c:\Test\.env\lib\site-packages\ray\rllib\algorithms\", line 1140, in compute_single_action
    action, state, extra = policy.compute_single_action(
  File "c:\Test\.env\lib\site-packages\ray\rllib\policy\", line 327, in compute_single_action
    out = self.compute_actions_from_input_dict(
  File "c:\Test\.env\lib\site-packages\ray\rllib\policy\", line 483, in compute_actions_from_input_dict
    return self._compute_action_helper(
  File "c:\Test\.env\lib\site-packages\ray\rllib\utils\", line 24, in wrapper
    return func(self, *a, **k)
  File "c:\Test\.env\lib\site-packages\ray\rllib\policy\", line 1016, in _compute_action_helper
    dist_inputs, state_out = self.model(input_dict, state_batches, seq_lens)
  File "c:\Test\.env\lib\site-packages\ray\rllib\models\", line 259, in __call__
    res = self.forward(restored, state or [], seq_lens)
  File "c:\Test\.env\lib\site-packages\ray\rllib\models\torch\", line 207, in forward
    assert seq_lens is not None

Hi @evo11x,

This code snippet should help I think.
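
A minimal sketch of the usual pattern for recurrent policies: fetch the initial LSTM state once, pass it to `compute_single_action` via the `state` argument, and feed the returned state back in on the next step. The helper name `run_episode`, the `max_steps` parameter, and the classic 4-tuple `env.step` API are assumptions made for illustration, not something taken from this thread.

```python
def run_episode(trainer, env, initial_state, max_steps=1000):
    """Roll out one episode, threading the recurrent state through each call.

    `trainer` is assumed to expose compute_single_action(obs, state=...),
    which returns (action, new_state, extra_outputs) when a state is passed.
    `env` is assumed to follow the classic gym API (reset() -> obs,
    step(a) -> (obs, reward, done, info)).
    """
    obs = env.reset()
    state = initial_state
    total_reward = 0.0
    for _ in range(max_steps):
        # Passing the state is what lets the LSTM model run its forward pass.
        action, state, _extra = trainer.compute_single_action(obs, state=state)
        obs, reward, done, _info = env.step(action)
        total_reward += reward
        if done:
            break
    return total_reward


# Possible usage with a real RLlib trainer (not executed here):
# state = trainer.get_policy().get_initial_state()
# episode_reward = run_episode(trainer, env, state)
```

Resetting the state to the initial state at the start of every episode keeps memory from one episode from leaking into the next.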

It works, but now the returned value is a tuple of 3 items instead of a NumPy array of 2 actions.

(array([-1. , 0...e=float32), [array([-0.50677115, ...e=float32), array([-0.55859864, ...e=float32)], {'action_dist_inputs': array([-0.30818474, ...e=float32), 'action_prob': 0.032917757, 'action_logp': -3.413743})

action[0] looks like my actions, but what are the other values in the returned tuple?

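Great that it works.
The outputs are: (action, new_state, extra_outputs). new_state is the updated LSTM state, which you pass back in as the state on the next call. extra_outputs is a dict with additional information about the chosen action, such as action_dist_inputs, action_prob, and action_logp.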
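
As a rough illustration of that tuple, the sketch below unpacks a result shaped like the one quoted earlier. The action and state values are shortened stand-ins, because the original arrays are truncated in the post. Only the `action_prob` and `action_logp` numbers are copied from it.

```python
import math

# Placeholder tuple shaped like the output quoted above. The action and
# state entries are hypothetical stand-ins, not real model output.
result = (
    [-1.0, 0.0],        # action: the values to send to the environment
    [[-0.5], [-0.55]],  # new_state: two state arrays (typically LSTM h and c)
    {"action_prob": 0.032917757, "action_logp": -3.413743},  # extra_outputs
)

action, new_state, extra_outputs = result

# action_logp is the log of action_prob, so exponentiating recovers it.
recovered_prob = math.exp(extra_outputs["action_logp"])
```

Only `action` needs to go to the environment. `new_state` should be kept and passed back on the next call, and `extra_outputs` can be ignored unless the probabilities are needed.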

