I have trained a PPO model using RLlib and saved a checkpoint. Now I want to load the checkpoint and run a custom evaluation function in which I call env.step myself. Roughly, I want the structure to look like this:
> from ray.rllib.algorithms.algorithm import Algorithm
>
> def custom_eval_function(algo):
>     ...
>     action = ...  # get action
>     # Gymnasium's step() returns terminated before truncated
>     state, reward, terminated, truncated, info = env.step(action)
>     ...
>
> algo = Algorithm.from_checkpoint(path)
How can I step through the environment in this way instead of using algo.evaluate()?
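
To make the loop I am after concrete, here is a minimal sketch that runs without RLlib. Everything in it is hypothetical: `CountingEnv` is a stand-in for my real Gymnasium-style environment, and `run_episode` and `compute_action` are names invented for illustration. In RLlib the callable might be `algo.compute_single_action` on versions that still provide that method, but that is an assumption and not something this snippet verifies.

```python
class CountingEnv:
    """Hypothetical stand-in for a Gymnasium-style env; terminates after `horizon` steps."""

    def __init__(self, horizon=3):
        self.horizon = horizon
        self.t = 0

    def reset(self, *, seed=None, options=None):
        self.t = 0
        return 0, {}  # (observation, info)

    def step(self, action):
        self.t += 1
        terminated = self.t >= self.horizon
        # Gymnasium order: obs, reward, terminated, truncated, info
        return self.t, float(action), terminated, False, {}


def run_episode(env, compute_action, max_steps=1000):
    """Roll out one episode, asking `compute_action(obs)` for each action."""
    obs, info = env.reset()
    total_reward, steps = 0.0, 0
    terminated = truncated = False
    while not (terminated or truncated) and steps < max_steps:
        action = compute_action(obs)  # assumed hook for the trained policy
        obs, reward, terminated, truncated, info = env.step(action)
        total_reward += reward
        steps += 1
    return total_reward, steps


# Dummy policy that always picks action 1, in place of the restored algorithm.
total, n = run_episode(CountingEnv(), lambda obs: 1)
print(total, n)
```

The open part is what to pass as `compute_action` once the algorithm has been restored from the checkpoint.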