Rollout/test an already trained policy using PolicyServerInput and PolicyClient

Hey guys,

I need some advice on the following:
I have an already trained policy stored in a checkpoint (trainer.save()). Now I restore from that checkpoint (trainer.restore()) and want to use the trained policy for an “external application”, i.e. I want to use RLlib’s PolicyServerInput and PolicyClient classes for inference only.
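
For context, here is a minimal sketch of what the client side of such an inference-only setup might look like. The helper name run_inference_episode, the max_steps limit, and the classic Gym-style env (reset() returns an observation, step() returns a 4-tuple) are assumptions for illustration. The client is expected to behave like RLlib's PolicyClient, whose start_episode() accepts training_enabled=False to mark the episode as not to be used for training.

```python
def run_inference_episode(client, env, max_steps=1000):
    """Run one episode through a PolicyClient-like object, inference only.

    Assumed setup (not part of this sketch, address is hypothetical):
        from ray.rllib.env.policy_client import PolicyClient
        client = PolicyClient("http://localhost:9900", inference_mode="remote")
    """
    # training_enabled=False asks the server not to learn from this episode.
    episode_id = client.start_episode(training_enabled=False)
    obs = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        # The server (or a local copy of the policy) picks the action.
        action = client.get_action(episode_id, obs)
        obs, reward, done, _info = env.step(action)
        # Rewards are still reported so the server can track episode returns.
        client.log_returns(episode_id, reward)
        total_reward += reward
        if done:
            break
    client.end_episode(episode_id, obs)
    return total_reward
```

Whether the server then needs trainer.train() or trainer.evaluate() to keep serving actions is exactly the open question below; this sketch only covers the client loop.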

What’s the best practice on the trainer/server side to test (“rollout”) the trained policy?
E.g.,

trainer = PPOTrainer(config={"explore": False}, ...)
trainer.restore(checkpoint)
while True:
    trainer.train()

or

trainer = PPOTrainer(config={"explore": False}, ...)
trainer.restore(checkpoint)
while True:
    trainer.evaluate()

Or is it better to skip the server/client setup entirely, restore a trainer from the checkpoint directly on the “virtual client side”, and call action = trainer.compute_action(obs) there?
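
To make that last option concrete, here is a rough sketch of the loop I have in mind. The function name rollout_episode, the max_steps limit, and the classic Gym-style env interface (reset() returns an observation, step() returns obs, reward, done, info) are assumptions; policy stands for anything exposing compute_action(obs), such as the restored trainer.

```python
def rollout_episode(policy, env, max_steps=1000):
    """Roll out one episode locally with an object exposing compute_action().

    `policy` would be the restored trainer, e.g. one built with
    config={"explore": False} and then trainer.restore(checkpoint).
    """
    obs = env.reset()
    total_reward = 0.0
    steps = 0
    for _ in range(max_steps):
        # Query the trained policy directly, no server or client involved.
        action = policy.compute_action(obs)
        obs, reward, done, _info = env.step(action)
        total_reward += reward
        steps += 1
        if done:
            break
    return total_reward, steps
```

With this approach no training or evaluation loop has to run on the trainer; it is only used as a container for the restored policy.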

Hi @klausk55,

Perhaps this example of how to use Ray Serve with a pretrained RLlib model will help.

https://docs.ray.io/en/latest/serve/tutorials/rllib.html