How severely does this issue affect your experience of using Ray?
- Medium: It causes significant difficulty in completing my task, but I can work around it.
- High: It blocks me from completing my task.
Hello, we have an exploration algorithm we'd like to port to RLlib that, with some time-annealed probability, consults an offline policy for advice. That is, most of the time we'd sample an action from the usual ActionDistribution produced by the RL policy, but occasionally we'd feed the observation to the advice policy and sample an action from it instead.
I'm relatively new to RLlib; subclassing Exploration seems to be the way to do this, but it's not clear to me whether, inside Exploration.get_exploration_action, I can actually access the observations I need in order to sample from the advice policy.
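For concreteness, here's a rough sketch of what I have in mind (assuming the torch/eager path where `timestep` is a plain int; `AdvicePolicy`, its `compute_single_action()` call, the annealing parameters, and the returned `logp=None` are placeholders for our own setup, and `_get_current_obs` marks exactly the part I don't know how to implement):

```python
import random

from ray.rllib.utils.exploration.exploration import Exploration
from ray.rllib.utils.schedules import PiecewiseSchedule


class AdviceExploration(Exploration):
    """With a time-annealed probability, act on an offline advice policy."""

    def __init__(self, action_space, *, framework, initial_p=0.5,
                 final_p=0.01, anneal_timesteps=100_000, **kwargs):
        super().__init__(action_space, framework=framework, **kwargs)
        # Linearly anneal the advice probability from initial_p to final_p.
        self.advice_prob = PiecewiseSchedule(
            endpoints=[(0, initial_p), (anneal_timesteps, final_p)],
            outside_value=final_p,
            framework=None,
        )
        self.advice_policy = ...  # load our offline advice policy here

    def get_exploration_action(self, *, action_distribution, timestep,
                               explore=True):
        if explore and random.random() < self.advice_prob.value(timestep):
            # Consult the advice policy instead of the RL policy.
            obs = self._get_current_obs()
            action = self.advice_policy.compute_single_action(obs)
            return action, None  # (action, logp)
        # Otherwise, sample from the policy's own action distribution.
        return (action_distribution.sample(),
                action_distribution.sampled_action_logp())

    def _get_current_obs(self):
        # TODO: This is the part I can't figure out -- is the observation
        # that produced `action_distribution` reachable from in here?
        raise NotImplementedError
```

which I'd then plug in via the usual exploration config, e.g.:

```python
config["exploration_config"] = {
    "type": AdviceExploration,
    "initial_p": 0.5,
}
```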
Any advice appreciated!