How to use PPOTorchPolicy.with_updates in Ray 1.9+?

How severely does this issue affect your experience of using Ray?

  • High: It blocks me from completing my task.

I am trying to upgrade from Ray 1.8.0 to Ray 1.9.2 (and later hopefully to the latest version) in our code base.
After the upgrade, I have multiple tests failing because we had been using PPOTorchPolicy.with_updates(...), which seems to be gone in 1.9.2. For example, we used to have:

CentralPPOPolicy = PPOTorchPolicy.with_updates(
    name="CentralPPOPolicy",
    postprocess_fn=postprocess_central_ppo,
    loss_fn=central_ppo_surrogate_loss,
    stats_fn=sgd_and_other_stats,
)

In Ray 1.9.2 this fails with AttributeError: type object 'PPOTorchPolicy' has no attribute 'with_updates'. How can I achieve the same behavior for the above with Ray 1.9.2+?

I feel like there should be a simple fix/approach for this, but at the moment the upgrade is blocked on it. Thanks for any help!

Hi @stefanbschneider,

what happens if you use build_policy_class() in 1.9.2 to first build the PPOTorchPolicy and then execute your code?

from ray.rllib.policy.policy_template import build_policy_class

PPOTorchPolicy = build_policy_class("PPOTorchPolicy", framework="torch")
CentralPPOPolicy = PPOTorchPolicy.with_updates(
    name="CentralPPOPolicy",
    postprocess_fn=postprocess_central_ppo,
    loss_fn=central_ppo_surrogate_loss,
    stats_fn=sgd_and_other_stats,
)

Hey @stefanbschneider,

I know that we have started to deprecate build_trainer. Using build_policy_class has long been recommended, but maybe this will get deprecated, too?

Have you tried simply subclassing the policy and overriding the functions that you are passing?
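
For example, a minimal, untested sketch against the class-based PPOTorchPolicy in 1.9.x (central_ppo_surrogate_loss, postprocess_central_ppo, and sgd_and_other_stats are your existing functions, which already take the policy as their first argument):

from ray.rllib.agents.ppo.ppo_torch_policy import PPOTorchPolicy

class CentralPPOPolicy(PPOTorchPolicy):
    # The old template's loss_fn becomes the loss() method; `self` takes
    # the place of the explicit `policy` argument.
    def loss(self, model, dist_class, train_batch):
        return central_ppo_surrogate_loss(self, model, dist_class, train_batch)

    # postprocess_fn becomes postprocess_trajectory().
    def postprocess_trajectory(self, sample_batch, other_agent_batches=None, episode=None):
        return postprocess_central_ppo(self, sample_batch, other_agent_batches, episode)

    # stats_fn maps to extra_grad_info() on torch policies.
    def extra_grad_info(self, train_batch):
        return sgd_and_other_stats(self, train_batch)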

Best


Hi @Lars_Simon_Zehnder and @arturn, thanks for your quick support!

As a quick fix, I went with @Lars_Simon_Zehnder's suggestion, which seems to work fine for now. I just had to define the custom functions directly in the build_policy_class() call:

PPOTorchPolicy = build_policy_class(
    "PPOTorchPolicy",
    framework="torch",
    loss_fn=central_ppo_surrogate_loss,
    postprocess_fn=postprocess_central_ppo,
    stats_fn=sgd_and_other_stats,
)
CentralPPOPolicy = PPOTorchPolicy.with_updates(name="CentralPPOPolicy")
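
In case it helps others: the resulting class can then be plugged into a trainer through the multiagent config, roughly like this (a sketch; "my_multi_agent_env" and the policy ID are placeholders):

from ray.rllib.agents.ppo import PPOTrainer
from ray.rllib.policy.policy import PolicySpec

config = {
    "framework": "torch",
    "multiagent": {
        # All agents map to the single central policy.
        "policies": {"central_ppo": PolicySpec(policy_class=CentralPPOPolicy)},
        "policy_mapping_fn": lambda agent_id, **kwargs: "central_ppo",
    },
}
trainer = PPOTrainer(config=config, env="my_multi_agent_env")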

Thanks again!


Hi @arturn, I saw that build_torch_policy() is deprecated in favor of subclassing directly: ray/torch_policy_template.py at master · ray-project/ray · GitHub

But surprisingly, it's not deprecated for TensorFlow: ray/tf_policy_template.py at master · ray-project/ray · GitHub

Do you know why and what the best practice for building TF Policies is at the moment?


Thanks for the timely question.
You are right, we are deprecating the builder pattern for Trainers and Policies, and in general prefer simple subclassing everywhere.
I will hopefully migrate all the policy classes in the next couple of weeks, including both TF and Torch policies.
You should be able to simply subclass PPOTorchPolicy for your use case.
Thanks again.


Hi @gjoliver, thanks for the quick response and for clarifying the roadmap.

Yes, subclassing PPOTorchPolicy works perfectly fine. I am now wondering about DQNTFPolicy: can I subclass that as well, or what is the current best practice there?

Yeah, please do.
It shouldn't cause any problems, but let us know if you run into anything.
I am hopefully going to migrate everything very soon.
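
A minimal sketch of what that could look like (untested; assumes overriding postprocess_trajectory on the template-generated TF class behaves the same way as on the torch policies):

from ray.rllib.agents.dqn.dqn_tf_policy import DQNTFPolicy

class MyDQNTFPolicy(DQNTFPolicy):
    # Keep DQN's built-in postprocessing (e.g. n-step), then apply custom
    # changes on top of the returned batch.
    def postprocess_trajectory(self, sample_batch, other_agent_batches=None, episode=None):
        batch = super().postprocess_trajectory(sample_batch, other_agent_batches, episode)
        # ... custom modifications to `batch` go here ...
        return batch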
