How severely does this issue affect your experience of using Ray?
- None: Just asking a question out of curiosity
Can I define the initial action at t = 0 (i.e., right after `obs = env.reset()`)?
- I'm using `use_prev_action` and `use_prev_reward` in my model config (a minimal sketch of my setup follows this list).
- I want to know how `prev_action` and `prev_reward` are defined at the first action inference (e.g., whether they are random, zero-filled, or something else).
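For reference, here is roughly what my setup looks like. This is a sketch, not my exact code; the `lstm_use_prev_action`/`lstm_use_prev_reward` key names follow recent RLlib model defaults and may differ in other Ray versions (older releases used a single `lstm_use_prev_action_reward` flag):

```python
# Sketch of my model config (key names assumed from RLlib's model defaults;
# adjust to your Ray version).
config = {
    "env": "CartPole-v1",  # placeholder for my custom env
    "model": {
        "use_lstm": True,
        "lstm_use_prev_action": True,   # feed a_{t-1} into the model
        "lstm_use_prev_reward": True,   # feed r_{t-1} into the model
    },
}
```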
I think the fastest workaround is to manually include the initial action and initial reward in my custom env's observation, e.g. via a wrapper like the sketch below.
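A rough sketch of that workaround (my own illustration, not an RLlib API: `PrevActionRewardWrapper` is a hypothetical name, and it assumes a Box action space and the old-style Gym API where `reset()` returns only the observation):

```python
import gym
import numpy as np
from gym import spaces


class PrevActionRewardWrapper(gym.Wrapper):
    """Appends the previous action and reward to the observation,
    so the values at t = 0 are explicit and under my control."""

    def __init__(self, env):
        super().__init__(env)
        # Assumes a Box action space; Discrete would need e.g. one-hot handling.
        self.observation_space = spaces.Dict({
            "obs": env.observation_space,
            "prev_action": env.action_space,
            "prev_reward": spaces.Box(-np.inf, np.inf, shape=(1,), dtype=np.float32),
        })

    def reset(self, **kwargs):
        obs = self.env.reset(**kwargs)
        # Explicit initial action/reward at t = 0 -- zeros here, but any
        # fixed choice would do.
        init_action = np.zeros(self.action_space.shape, dtype=self.action_space.dtype)
        return {
            "obs": obs,
            "prev_action": init_action,
            "prev_reward": np.zeros(1, dtype=np.float32),
        }

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        wrapped = {
            "obs": obs,
            "prev_action": np.asarray(action, dtype=self.action_space.dtype),
            "prev_reward": np.array([reward], dtype=np.float32),
        }
        return wrapped, reward, done, info
```

The point is just that whatever I put in `reset()` becomes the well-defined t = 0 value, instead of relying on whatever RLlib fills in internally.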