I have trained a PPO model using RLlib and saved a checkpoint. Now I want to load the checkpoint and run a custom evaluation function in which I call env.step myself. Roughly, I want the structure to be similar to this:
> from ray.rllib.algorithms.algorithm import Algorithm
>
> def custom_eval_function(algo):
>     ...
>     action = ...  # get action
>     # Gymnasium-style step returns terminated before truncated
>     state, reward, terminated, truncated, info = env.step(action)
>     ...
>
> algo = Algorithm.from_checkpoint(path)
How can I step through the environment in this way instead of using algo.evaluate()?
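To make the loop I have in mind concrete, here is a minimal sketch that does not depend on RLlib. `DummyEnv` and `get_action` are hypothetical stand-ins, not part of any library: `DummyEnv` only mimics the Gymnasium `reset`/`step` signatures, and `get_action` is whatever callable maps an observation to an action. With the restored algorithm, I assume this would be a thin wrapper around something like `algo.compute_single_action(obs)`, but which call is appropriate likely depends on the Ray version and API stack.

```python
class DummyEnv:
    """Hypothetical stand-in that follows the Gymnasium reset/step signatures."""

    def __init__(self, horizon=5):
        self.horizon = horizon
        self.t = 0

    def reset(self, seed=None):
        self.t = 0
        return 0, {}  # (observation, info)

    def step(self, action):
        self.t += 1
        terminated = self.t >= self.horizon  # episode ends after `horizon` steps
        truncated = False
        return self.t, 1.0, terminated, truncated, {}


def custom_eval_function(env, get_action, num_episodes=1):
    """Step the env manually and return the total reward of each episode."""
    returns = []
    for _ in range(num_episodes):
        obs, info = env.reset()
        done = False
        total = 0.0
        while not done:
            action = get_action(obs)  # assumption: wraps the trained policy
            obs, reward, terminated, truncated, info = env.step(action)
            total += reward
            done = terminated or truncated
        returns.append(total)
    return returns


# Usage with a constant placeholder policy instead of the trained one.
episode_returns = custom_eval_function(DummyEnv(horizon=5), lambda obs: 0, num_episodes=2)
```

The part I am missing is the RLlib equivalent of `get_action` for an algorithm restored with `Algorithm.from_checkpoint(path)`.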