Passing 'custom_action_dist'

I am on Ray 2.0 and training PPO with a Dirichlet action distribution.

I am training my model like this:

 tuner = tune.Tuner("PPO", param_space=config,
                                        name =  experiment_name,
    results =

Where does the “custom_action_dist” parameter go now? The config dict has changed since the old examples on the website.

To give more background, these are all the steps I performed:

  1. Import the “Simplex” action space from RLlib and use it as self.action_space in the environment's __init__:

    from ray.rllib.utils.spaces.simplex import Simplex

  2. Import the Dirichlet action distribution from RLlib:
    from ray.rllib.models.torch.torch_action_dist import TorchDirichlet as Dirichlet

  3. Register the new action distribution:
    from ray.rllib.models import ModelCatalog
    ModelCatalog.register_custom_action_dist("Dirichlet", Dirichlet)

  4. Pass “custom_action_dist” to the trainer.
    This is the part that I don’t know how to do when training with Tune, since the config dict in Ray 2.0 differs from the examples on the website.
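
For reference, a minimal sketch of step 4, assuming the distribution was registered under the name "Dirichlet" as in step 3. In RLlib the registered name goes under the "model" key of the config dict. The helper names `build_config` and `run`, the environment name, and the use of `air.RunConfig` to carry the experiment name are illustrative assumptions, not taken from the original post.

```python
def build_config(env_name):
    """Return a PPO config dict that selects the registered custom distribution."""
    return {
        "env": env_name,  # hypothetical: must be an environment RLlib can resolve
        "framework": "torch",
        "model": {
            # Must match the name given to ModelCatalog.register_custom_action_dist
            "custom_action_dist": "Dirichlet",
        },
    }


def run(env_name, experiment_name):
    """Register the distribution and launch PPO through Tune (requires Ray)."""
    # Ray imports are kept local so that build_config works without Ray installed.
    from ray import air, tune
    from ray.rllib.models import ModelCatalog
    from ray.rllib.models.torch.torch_action_dist import TorchDirichlet

    ModelCatalog.register_custom_action_dist("Dirichlet", TorchDirichlet)
    tuner = tune.Tuner(
        "PPO",
        param_space=build_config(env_name),
        run_config=air.RunConfig(name=experiment_name),
    )
    return tuner.fit()
```

Calling `run("MyEnv-v0", "dirichlet_ppo")` would start training, where both arguments are placeholders for your own registered environment and experiment name.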

Hello @mannyv. Thank you very much for your pointer, but I think there is something else going on. I am getting this error, which is usually a “catch-all” (or “red herring”) for an error somewhere else:

AttributeError: 'PPO' object has no attribute '_warmup_time'

The error above is misleading; I believe the real issue is the one shown below. RLlib is trying to calculate the KL divergence and is calling the Dirichlet class for it. I am not sure whether I am doing the steps correctly and importing the right things.

 File "/usr/local/lib/python3.9/dist-packages/ray/rllib/models/torch/", line 643, in kl
    return self.dist.kl_divergence(other.dist)
AttributeError: 'Dirichlet' object has no attribute 'kl_divergence'

I see in the official implementation here of the Dirichlet class that the existing method is called “kl” and not “kl_divergence”.

To me, these lines are missing from the official code here:

    def kl(self, other):
        return torch.distributions.kl.kl_divergence(self.dist, other.dist)
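
As a quick check of the claim, here is a small sketch using plain PyTorch only (no RLlib), with arbitrary example concentrations. It assumes nothing beyond `torch.distributions`: the distribution object has no `kl_divergence` method, while the module-level function computes the KL between two Dirichlets.

```python
import torch
from torch.distributions import Dirichlet
from torch.distributions.kl import kl_divergence

# Two Dirichlet distributions with arbitrary example concentrations
p = Dirichlet(torch.tensor([1.0, 2.0, 3.0]))
q = Dirichlet(torch.tensor([2.0, 2.0, 2.0]))

# The distribution object itself exposes no kl_divergence method,
# which is what the AttributeError in the traceback complains about.
has_method = hasattr(p, "kl_divergence")

# The KL rule for a pair of Dirichlets is reached through the module-level function.
kl_pq = kl_divergence(p, q)
kl_pp = kl_divergence(p, p)  # KL of a distribution with itself

print(has_method, kl_pq.item(), kl_pp.item())
```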

I’ve created a minimal example of the error here:

To me, this is a bug. Either the kl method is incorrect and should be amended as I propose, or, as I am doing in my code for now, the kl method can simply be deleted so that it is inherited from the parent class.

Hey @Username1, you are right. Thanks for bringing up the bug. I have just made a PR to fix this issue. TorchDirichlet is not something we have good test coverage for.

The fix basically inherits the default kl computation logic from the parent class, which is indeed what you suggested.
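
To illustrate why dropping the override is enough, here is a toy sketch with stand-in classes. The class names are hypothetical and this is not RLlib's actual code; it only mirrors the failing call from the traceback and the inherited default described above.

```python
import torch
from torch.distributions import Dirichlet
from torch.distributions.kl import kl_divergence


class WrapperSketch:
    """Hypothetical stand-in for the parent distribution wrapper."""

    def __init__(self, concentration):
        self.dist = Dirichlet(concentration)

    def kl(self, other):
        # Default logic: delegate to torch's module-level KL function
        return kl_divergence(self.dist, other.dist)


class BuggyDirichletSketch(WrapperSketch):
    def kl(self, other):
        # Mirrors the failing call: torch distributions have no such method
        return self.dist.kl_divergence(other.dist)


class FixedDirichletSketch(WrapperSketch):
    # No kl override, so the parent's implementation is inherited
    pass


a = torch.tensor([1.0, 2.0, 3.0])
b = torch.tensor([3.0, 2.0, 1.0])

try:
    BuggyDirichletSketch(a).kl(BuggyDirichletSketch(b))
    buggy_failed = False
except AttributeError:
    buggy_failed = True

fixed_kl = FixedDirichletSketch(a).kl(FixedDirichletSketch(b))
print(buggy_failed, fixed_kl.item())
```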

Thank you angel for coming to my rescue! I’ve been scratching my head for a week! Cheers and case closed!