I was browsing the RLlib documentation on creating environments and came across this example:
    class MyEnv(gym.Env):
        def __init__(self, env_config):
            self.action_space = <gym.Space>
            self.observation_space = <gym.Space>

        def reset(self):
            return <obs>

        def step(self, action):
            return <obs>, <reward: float>, <done: bool>, <info: dict>
Should the step function return the done signal or the terminated and truncated signals?
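
For context, my understanding is that Gym 0.26+ and Gymnasium split `done` into `terminated` and `truncated`, and that `reset` returns `(obs, info)` there. Below is a minimal sketch of what I think the five-value version would look like. It is a dependency-free toy: the class name `MyEnvSketch`, the `max_steps` config key, and the goal logic are all made up for illustration, and a real environment would subclass the Gym/Gymnasium `Env` class and define the two spaces. Whether the RLlib version I have installed expects this form is exactly what I am unsure about.

```python
class MyEnvSketch:
    """Hypothetical toy env using the five-value step convention.

    A real env would subclass the Gym/Gymnasium Env class and define
    action_space and observation_space; both are left out here so the
    sketch runs without any dependencies.
    """

    def __init__(self, env_config=None):
        env_config = env_config or {}
        # "max_steps" is a made-up config key, used only for this sketch.
        self.max_steps = env_config.get("max_steps", 5)
        self.goal = 3
        self.pos = 0
        self.steps = 0

    def reset(self, *, seed=None, options=None):
        self.pos = 0
        self.steps = 0
        return self.pos, {}  # (obs, info)

    def step(self, action):
        # Action 1 moves right, anything else moves left.
        self.pos += 1 if action == 1 else -1
        self.steps += 1
        # terminated: the task itself ended (goal reached).
        terminated = self.pos == self.goal
        # truncated: the episode was cut off by the step limit.
        truncated = self.steps >= self.max_steps and not terminated
        reward = 1.0 if terminated else 0.0
        return self.pos, reward, terminated, truncated, {}
```

With the old four-value form, I assume `done` would simply be `terminated or truncated`.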