[rllib] Custom Evaluation: No actions column in the SampleBatch. How can we access actions in evaluation?

Hi everyone,

I’m trying to write a custom evaluation function by following the example code in rllib/examples/custom_eval.py.

First, I should say that I’m very new to Ray and RLlib, so I apologize in advance if my question is unclear or rests on a misunderstanding.

I’m modifying the example code for custom evaluation rounds (lines 129-133 of ray/rllib/examples/custom_eval.py at ray-2.5.1, in the ray-project/ray GitHub repository).

My problem is that I want to access the actions in these evaluation batches but, as shown below, the SampleBatch has no key named ‘actions’.

for _ in range(5):
    eval_result_dict = eval_workers.local_worker().sample()
eval_sample_batch = eval_result_dict['default_policy']
print(eval_sample_batch)

And the print output:

SampleBatch(21: ['obs', 'new_obs', 'rewards', 'terminateds', 'truncateds', 'infos', 'eps_id', 'unroll_id', 'agent_index', 't', 'vf_preds', 'advantages', 'value_targets'])

After some debugging and tracing through the code, I saw that env_runner_v2.py says the actions column will be populated by StateBufferConnector. Normally, my SampleBatch objects include an actions column as well, but I’m not able to access it in the evaluation code. I also tried different environments, so I suspect the cause is the StateBufferConnector and evaluation mode rather than the environment.
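One fallback I am considering, in case the column really is unavailable in evaluation mode, is to step the environment by hand inside the custom evaluation function and record the actions myself. Below is a minimal sketch of that idea, not something taken from custom_eval.py. The function name `rollout_and_record_actions` is hypothetical. It assumes the environment follows the gymnasium API (`reset()` returns `(obs, info)`, `step()` returns a 5-tuple) and that the policy object has a `compute_single_action(obs, explore=False)` method returning a tuple whose first element is the action, which is my understanding of how RLlib's Policy behaves. It does not handle recurrent (stateful) policies.

```python
from typing import Any, List, Tuple


def rollout_and_record_actions(
    env: Any, policy: Any, max_steps: int = 1000
) -> Tuple[List[Any], float]:
    """Run one episode by hand and return (actions, episode_return).

    Assumptions (not from custom_eval.py):
      - `env` follows the gymnasium API: reset() -> (obs, info) and
        step(action) -> (obs, reward, terminated, truncated, info).
      - `policy.compute_single_action(obs, explore=False)` returns a
        3-tuple whose first element is the action.
      - The policy is stateless (no RNN state is passed between steps).
    """
    actions: List[Any] = []
    episode_return = 0.0
    obs, _ = env.reset()
    for _ in range(max_steps):
        # Ask the policy for a deterministic (non-exploring) action.
        action, _, _ = policy.compute_single_action(obs, explore=False)
        actions.append(action)
        obs, reward, terminated, truncated, _ = env.step(action)
        episode_return += float(reward)
        if terminated or truncated:
            break
    return actions, episode_return
```

Inside the custom evaluation function, I assume the policy could be obtained with something like `eval_workers.local_worker().get_policy()`, with the environment created separately, but I have not verified that this matches what the evaluation workers themselves use.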

To summarize, how can we access the actions column in a custom evaluation function?