How to get and use a trained policy

How severely does this issue affect your experience of using Ray?

  • High: It blocks me from completing my task.

Description

I have been training single-agent models on some of the built-in Gym environments (CartPole and FrozenLake) as well as on some custom environments, using Ray 2.35.0. After successfully training a model, I would like to use it to manually select actions for novel observations that were not seen during training. Can anyone advise on how to do this with the new API stack?
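To make the goal concrete, here is a minimal sketch of the kind of manual action selection I have in mind. It does not use any Ray API: `policy_logits` is a hypothetical stand-in for whatever the trained model would expose, and `dummy_policy_logits` is a made-up scorer for a CartPole-like four-element observation, used only so the example runs.

```python
import math
from typing import Callable, List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw action scores into probabilities (numerically stable)."""
    highest = max(logits)
    exps = [math.exp(x - highest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_action(
    policy_logits: Callable[[Sequence[float]], Sequence[float]],
    observation: Sequence[float],
) -> Tuple[int, List[float]]:
    """Pick the greedy (most probable) discrete action for one observation.

    `policy_logits` is a hypothetical callable standing in for the trained
    model; it maps an observation to one score per discrete action.
    """
    probs = softmax(policy_logits(observation))
    action = max(range(len(probs)), key=probs.__getitem__)
    return action, probs


def dummy_policy_logits(observation: Sequence[float]) -> List[float]:
    """Made-up placeholder policy, NOT a trained model.

    Assumes a CartPole-like observation whose third element is the pole
    angle, and favors pushing toward the side the pole is leaning.
    """
    angle = observation[2]
    return [-angle, angle]


# A novel observation that was never part of training.
novel_observation = [0.0, 0.0, 0.05, 0.0]
action, probs = select_action(dummy_policy_logits, novel_observation)
print(action, probs)
```

What I am missing is the real counterpart of `policy_logits`, that is, a supported way to get action scores (or actions) out of the trained model for a given observation.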

Note that this documentation describes how to get the trained policy for export, but it does not describe how to run inference with it. In any case, the get_policy method referenced in that documentation does not seem to work. Here is the basic sequence of commands that I have tried:

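from ray.rllib.algorithms.ppo import PPOConfig

config = (
    PPOConfig()
    ...
)

# Build the algorithm from the config.
algo = config.build()

# Train for 100 iterations.
for _ in range(100):
    result = algo.train()

# Attempt to retrieve the trained policy; this call raises the error below.
ppo_policy = algo.get_policy()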

When I try to use the get_policy method, I get the following error:

AttributeError: 'SingleAgentEnvRunner' object has no attribute 'get_policy'

It seems that others have encountered this issue (see this ticket on GitHub). This looks like a major problem, since without a way to access the trained policy I cannot use the trained model at all. Is there some other workflow for deploying trained RL agents with the new API stack?