I have a custom environment that outputs a property called "p1", which ranges from 10^-4 to 100.
Algorithm - PPO
Action space - Box(-0.05, +0.05, (3,))
Episode length - 10 steps
The aim of the agent is to reach a state where p1 is as close to 10^-4 as possible.
Problem - states where p1 < 0.1 occur very rarely (about 0.5 %). Hence the learned policy is sub-optimal, i.e. it reaches states where p1 ≈ 0.1 rather than 0.0001.
I want the agent to take larger actions, i.e. Box(±0.05), for the first five steps and smaller actions, like Box(±0.001), for the last five steps.
One way of looking at your problem is that, with the action space as you defined it, the desired later actions Box(±0.001) occupy only a tiny fraction of the overall action space.
To address this, you might consider applying a transform to your action space, perhaps something similar to the log-modulus transform L(x) = sign(x) · log(|x| + 1) described in "A log transformation of positive and negative values" on The DO Loop (though you might need to change the formula a bit, say by multiplying x by a large constant before taking the log).
The right transform may make it much easier for exploration to discover the good regions of the action space.
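If it helps, here is a minimal sketch of what such a transform could look like as an action wrapper, assuming a Gymnasium-style environment. The wrapper name, the `max_action` and `sharpness` parameters, and the exact exponential form are illustrative choices of mine, not a prescribed formula; the idea is just that the policy samples from a normalized box and the wrapper squashes small actions toward zero before they reach the environment.

```python
import numpy as np
import gymnasium as gym


class LogScaleActionWrapper(gym.ActionWrapper):
    """Policy acts in a normalized Box(-1, 1, (3,)); the wrapper maps that
    action onto the environment's Box(-0.05, 0.05, (3,)) through an
    exponential (inverse log-style) transform, so a large share of the
    normalized space corresponds to very small environment actions."""

    def __init__(self, env, max_action=0.05, sharpness=10.0):
        super().__init__(env)
        self.max_action = max_action
        self.sharpness = sharpness
        n = env.action_space.shape[0]
        # This is the action space PPO will actually sample from.
        self.action_space = gym.spaces.Box(
            low=-1.0, high=1.0, shape=(n,), dtype=np.float32
        )

    def action(self, act):
        act = np.clip(act, -1.0, 1.0)
        # Small |act| maps to environment actions several orders of
        # magnitude smaller than max_action; |act| = 1 maps to exactly
        # +/- max_action.
        scale = (np.exp(self.sharpness * np.abs(act)) - 1.0) / (
            np.exp(self.sharpness) - 1.0
        )
        return np.sign(act) * self.max_action * scale
```

With these illustrative values (sharpness = 10), a normalized action of 0.5 maps to an environment action of roughly 3e-4, so half of the policy's action range already covers the fine-grained moves you want near the end of an episode; you would tune the sharpness (or plug in the log-modulus formula directly) to suit your environment.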