When using an RLlib trainer, I noticed that a function handle is provided for `reset_config`; however, this function is simply a stub that returns `False`.
This says to me that we, the users, are meant to subclass `Trainable` and override that method. But if we already have a remote class that wraps a trainer object, what is the correct way to handle the override?
```python
@ray.remote
class remote_trainer:
    def __init__(self):
        self.trainer = PPOTrainer(ppoConfig, env_name)
        ...

    def train(self):
        result = self.trainer.train()
        self.logger.log(result)
        return result

    def evaluate(self):
        # run rollout with trainer
        return rollout_statistics

    def reset(self, new_config):
        self.trainer.reset_config(new_config)  # this will return False
```
If I were only using one algorithm, I could see just making my remote-trainer class inherit from, e.g., `PPOTrainer`. But if I want it to be optimization-algorithm agnostic, where should I override the reset function? The hacky solution I can think of is to replace the function outright, e.g.
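One algorithm-agnostic alternative I have considered is building the subclass dynamically from whichever trainer class the user picks, so the override lives in a proper subclass and `self` works normally. A minimal sketch of the idea — `BaseTrainer` here is a stand-in for an RLlib trainer class (e.g. `PPOTrainer`), not the real RLlib API:

```python
class BaseTrainer:
    """Stand-in for an RLlib trainer class such as PPOTrainer."""

    def __init__(self, config):
        self.config = config

    def reset_config(self, new_config):
        # Default stub: signals "not supported", like Trainable.reset_config
        return False


def make_resettable(trainer_cls):
    """Return a subclass of trainer_cls with reset_config overridden."""

    class Resettable(trainer_cls):
        def reset_config(self, new_config):
            self.config = new_config  # apply whatever reuse logic is needed
            return True  # tell the caller the reset succeeded

    Resettable.__name__ = f"Resettable{trainer_cls.__name__}"
    return Resettable


TrainerCls = make_resettable(BaseTrainer)
trainer = TrainerCls({"lr": 1e-3})
print(trainer.reset_config({"lr": 1e-4}))  # True: the override is in effect
```

The user-selected trainer class is passed to `make_resettable` once, and everything downstream stays algorithm-agnostic.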
```python
@ray.remote
class remote_trainer:
    def __init__(self, update_config_function):
        self.trainer = PPOTrainer(ppoConfig, env_name)
        self.trainer.reset_config = update_config_function
        ...
```
But that would break things, since a function assigned to the instance this way is not bound and loses its `self` reference.
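If the instance really is patched rather than subclassed, the `self` reference can be preserved by binding the replacement with `types.MethodType`. A sketch of that workaround — again with a stand-in trainer class rather than the real RLlib API:

```python
import types


class BaseTrainer:
    """Stand-in for an RLlib trainer class such as PPOTrainer."""

    def __init__(self, config):
        self.config = config

    def reset_config(self, new_config):
        return False  # default stub


def update_config_function(self, new_config):
    # Plain function written with an explicit `self` parameter
    self.config = new_config
    return True


trainer = BaseTrainer({"lr": 1e-3})
# Bind the function to the instance so `self` is supplied automatically
trainer.reset_config = types.MethodType(update_config_function, trainer)
print(trainer.reset_config({"lr": 1e-4}))  # True
```

Whether monkey-patching a trainer instance like this is safe in practice is exactly what I am unsure about, which is why I am asking for the intended approach.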
I am at a loss as to which class I should be overriding here, since in my application I want the user to be able to select any of the various RLlib trainers.