What is the recommended way to make use of a trained model?


I’m training some models for a custom MuJoCo env that closely matches a robot I’ve built in real life. I’m now looking at how to use that model outside of the training infrastructure (ideally).

I see that there’s compute_actions on the ray.rllib.agents.trainer.Trainer object, but that would require setting up the trainer object with all the config that goes with it.

I can get the ray.rllib.policy.policy.Policy and call export_model. I’m using SAC with torch, and this successfully creates a TorchScript model. However, it seems the forward function for SAC doesn’t actually do anything, for an unrelated reason: https://github.com/ray-project/ray/blob/master/rllib/agents/sac/sac_torch_model.py#L140

It looks like the TorchScript has the model in it, but the forward function appears to be a no-op:

def forward(self,
    input_dict: Dict[str, Tensor],
    state: List[Tensor],
    seq_lens: Tensor) -> Tuple[Tensor, List[Tensor]]:
  _0, = state
  return (input_dict["obs"], [_0])
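To see concretely why the export isn’t usable for inference, here is a minimal scripted module with the same pass-through forward (a standalone reproduction for illustration, not RLlib’s actual class):

```python
from typing import Dict, List, Tuple

import torch
from torch import Tensor, nn


class PassThrough(nn.Module):
    """Reproduces the exported no-op forward, for illustration only."""

    def forward(
        self,
        input_dict: Dict[str, Tensor],
        state: List[Tensor],
        seq_lens: Tensor,
    ) -> Tuple[Tensor, List[Tensor]]:
        # Echoes the raw observation and state straight back.
        return input_dict["obs"], [state[0]]


scripted = torch.jit.script(PassThrough())
obs = torch.randn(1, 4)
out_obs, out_state = scripted({"obs": obs}, [torch.zeros(1)], torch.tensor([1]))
assert torch.equal(out_obs, obs)  # the output is just the untouched observation
```

Calling the exported module therefore never runs the policy network; it hands you back your own observation.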

I haven’t had any success getting the TensorFlow version of SAC to converge, and it seems to be slower than torch on my machine.

Is the only way to use the policy to recreate the training object from a checkpoint?


I think you can do this without needing the entire config that was used for training your SAC agent.
In this example, we restore a trained experiment from its checkpoint, using only the environment that we originally trained the agent on:
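A minimal sketch of that restore flow, using the ray.rllib.agents API from around the time of this thread (the env name and checkpoint path below are placeholders, not values from the original experiment):

```python
import gym
import ray
from ray.rllib.agents.sac import SACTrainer

ray.init()

# Placeholders: substitute your registered custom env and real checkpoint path.
trainer = SACTrainer(env="Pendulum-v1")
trainer.restore("/path/to/checkpoint/checkpoint-100")

# Query the restored policy directly, outside any training loop.
env = gym.make("Pendulum-v1")
obs = env.reset()
action = trainer.compute_action(obs)
```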

See if this solves your problem.



Thanks for the quick reply.

It seems this example doesn’t need a config because it’s mostly using defaults. I managed to get it working for my env and training checkpoint, but I had to create a config to change the framework and model parameters before the checkpoint loaded successfully. I guess that makes sense, but it would be nice if I didn’t have to export model metadata separately from files that probably already contain the same info somewhere inside.
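For anyone hitting the same thing, the override was roughly of this shape (a sketch: the keys are standard RLlib SAC config keys, but the values here are placeholders rather than my actual settings):

```python
# Partial config merged over RLlib's SAC defaults before restoring the checkpoint.
config = {
    "framework": "torch",  # checkpoint was trained with torch, not the default tf
    "Q_model": {
        "fcnet_hiddens": [256, 256],  # placeholder: must match the trained sizes
    },
    "policy_model": {
        "fcnet_hiddens": [256, 256],  # placeholder: must match the trained sizes
    },
}
```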

