Ignore step in Environment

I am coupling RLlib with simulation software. The problem is that the simulation sometimes does not converge, which leads to implausible outputs and therefore a wrong correlation between action and reward.

Is there a setting in RLlib to ignore such a step?

Thanks in advance

Hi @SebastianBo1995

RLlib has no way of knowing what counts as an implausible value in your simulation.
You can, however, wrap your simulation in your own custom environment class and do the checks yourself. The following code is taken from the official website and slightly modified.

import gym, ray
from ray.rllib.agents import ppo

class MyEnv(gym.Env):
    def __init__(self, env_config):
        # SebastiansSimulation is a placeholder for your own simulation wrapper
        self.env = SebastiansSimulation()

    def reset(self):
        obs = self.env.reset()
        # <your plausibility tests and actions go here>
        return obs

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        # <your plausibility tests and actions go here>
        return obs, reward, done, info

ray.init()
trainer = ppo.PPOTrainer(env=MyEnv, config={
    "env_config": {},  # config to pass to env class
})
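
To make the placeholder comments above a bit more concrete, here is a minimal sketch of one possible plausibility check. Everything in it is an assumption for illustration: `DivergingSim` is a made-up stand-in for the real simulation, `is_plausible` only tests for finite values, and `PlausibilityGuard` is a hypothetical wrapper name. It also assumes the old gym API where `step` returns `(obs, reward, done, info)`. Ending the episode with a neutral reward is just one way to handle a non-converged step; it keeps the bad values away from the agent, but it is not the same as the step never having happened.

```python
import math


class DivergingSim:
    """Hypothetical stand-in for the real simulation; returns NaN on its third step."""

    def __init__(self):
        self.t = 0

    def reset(self):
        self.t = 0
        return [0.0]

    def step(self, action):
        self.t += 1
        if self.t == 3:  # pretend the solver failed to converge here
            return [float("nan")], float("nan"), False, {}
        return [float(self.t)], 1.0, False, {}


def is_plausible(obs, reward):
    """Assumed check: every value must be finite. Replace with domain-specific limits."""
    return all(math.isfinite(x) for x in obs) and math.isfinite(reward)


class PlausibilityGuard:
    """Wraps an env whose step() returns (obs, reward, done, info)."""

    def __init__(self, env):
        self.env = env
        self.last_obs = None

    def reset(self):
        self.last_obs = self.env.reset()
        return self.last_obs

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        if not is_plausible(obs, reward):
            # Do not pass the bad values on: reuse the last good observation,
            # give a neutral reward, and end the episode.
            info = dict(info, implausible=True)
            return self.last_obs, 0.0, True, info
        self.last_obs = obs
        return obs, reward, done, info


env = PlausibilityGuard(DivergingSim())
env.reset()
results = [env.step(0) for _ in range(3)]
```

In MyEnv the same check would sit at the placeholder comment in step. The sketch does not handle an implausible observation coming out of reset; that case would need its own handling, for example resetting again.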

Cheers