Implementing Dirichlet distribution

hossein836 · February 9, 2022, 1:54pm

Hi, I’m relatively new to RLlib and have just started exploring it. I was trying to implement a Dirichlet distribution for a resource allocation system. Since I didn’t find any real example on this topic, I want to ask whether this distribution is available in PPO. I gave it a try but got this error:

module 'gym.spaces' has no attribute 'Simplex'

That is natural, because gym does not have a Simplex space. So how can I implement a Dirichlet distribution?

Q2. What do the dimensions of a Simplex mean, for example Simplex(shape=(3, 4))?
Suppose I have 2 batteries responsible for charging 4 wheels, and they are all connected. What Simplex dimensions should I use?

gjoliver · February 14, 2022, 6:44pm

This seems more like a general RL question than an RLlib-specific one.
I’d suggest you ask in some other RL forums.
Do you have any example of a working Simplex gym space?
As for the Dirichlet, you can always add more action distributions to RLlib following this API:

github.com

ray-project/ray/blob/master/rllib/models/action_dist.py

import numpy as np
import gym

from ray.rllib.models.modelv2 import ModelV2
from ray.rllib.utils.annotations import DeveloperAPI
from ray.rllib.utils.typing import TensorType, List, Union, ModelConfigDict


@DeveloperAPI
class ActionDistribution:
    """The policy action distribution of an agent.

    Attributes:
        inputs (Tensors): input vector to compute samples from.
        model (ModelV2): reference to model producing the inputs.
    """

    @DeveloperAPI
    def __init__(self, inputs: List[TensorType], model: ModelV2):
        """Initializes an ActionDist object.

This file has been truncated.

Should be relatively easy.
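As a rough illustration of the math such a custom action distribution would need, here is a minimal standard-library sketch. It is not the RLlib API: the function names and the `EPS` constant are hypothetical, and mapping model outputs to concentrations with `exp` is just one common choice assumed here.

```python
import math
import random

EPS = 1e-7  # hypothetical constant that keeps every concentration > 0


def concentration_from_logits(logits):
    """Map unconstrained model outputs to positive Dirichlet concentrations."""
    return [math.exp(z) + EPS for z in logits]


def sample_dirichlet(alpha, rng=random):
    """Draw one point on the simplex by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def dirichlet_logp(x, alpha):
    """Log-density of x under Dirichlet(alpha)."""
    log_beta = sum(math.lgamma(a) for a in alpha) - math.lgamma(sum(alpha))
    return sum((a - 1.0) * math.log(xi) for a, xi in zip(alpha, x)) - log_beta


# Zero logits give concentrations of about 1, i.e. a uniform Dirichlet.
logits = [0.0, 0.0, 0.0]
alpha = concentration_from_logits(logits)
action = sample_dirichlet(alpha, random.Random(0))
print(action, sum(action), dirichlet_logp(action, alpha))
```

A real RLlib action distribution would do the same three things (turn model outputs into parameters, sample, and compute log-probabilities) on batched tensors rather than Python lists.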

hossein836 · February 16, 2022, 12:57pm

Thanks @gjoliver for your answer.

You can find a Dirichlet distribution in the torch distributions here, at line 537, so RLlib supports the Dirichlet distribution in the torch framework (I mostly use torch). You can find an implementation in this article.
My problem was defining the action space. Since I use gym envs, I defined my action space like the RLlib example in the utils code here; I had imported that code, of course.
So I was confused about how to define my action space for the Dirichlet distribution (I got the error above for Simplex).
Now I’m considering discretizing the action space and using a multi-discrete distribution with action masking for these problems.
If you know a good forum or community, please point me to it. That would be much appreciated.

gjoliver · February 21, 2022, 8:08pm

Oh, if RLlib already has a Simplex space implementation, shouldn’t you just use it like this:

import gym
from ray.rllib.utils.spaces import simplex

class YourEnv(gym.Env):
    def __init__(self):
        ...
        # dim1 and dim2 are placeholders for your own sizes.
        self.action_space = simplex.Simplex(shape=(dim1, dim2))
        ...

?

You can’t use it like gym.spaces.Simplex, because gym.spaces doesn’t have it.
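On the earlier question of what the Simplex dimensions mean, here is a small standard-library sketch. It assumes that the last axis of the shape is the one whose entries are non-negative and sum to 1, which is how the RLlib Simplex docstring describes it; please check this against your installed version. `sample_allocation` is a hypothetical helper, not part of RLlib.

```python
import random


def sample_allocation(shape, rng):
    """Sample a nested list whose last axis lies on the simplex (hypothetical helper)."""
    if len(shape) == 1:
        # Normalized Gamma(1, 1) draws are uniformly distributed on the simplex.
        draws = [rng.gammavariate(1.0, 1.0) for _ in range(shape[0])]
        total = sum(draws)
        return [d / total for d in draws]
    return [sample_allocation(shape[1:], rng) for _ in range(shape[0])]


# 2 batteries, each splitting its output across 4 wheels.
allocation = sample_allocation((2, 4), random.Random(0))
for battery, shares in enumerate(allocation):
    print(battery, [round(s, 3) for s in shares], "sum =", round(sum(shares), 3))
```

Under that assumption, shape=(2, 4) would mean each battery divides its output among the 4 wheels, while shape=(4, 2) would mean each wheel divides its demand between the 2 batteries. Which one fits depends on how the allocation problem is defined.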

hossein836 · February 22, 2022, 8:54am

You are right. I thought I had written my code the way you describe, which is why I’m confused, but it’s possible that I’m wrong (I was working on 4 envs simultaneously). I will test again and share the result if you are interested.
My initial thought is that it could yield a better result (especially if you don’t want to discretize). I’m going to test:
multi-discrete with action masking vs. Dirichlet distribution
There are pros and cons to each, though.
I’ll try to do this within two weeks.
Thanks a lot.

hossein836 · March 12, 2022, 8:38am

Well, the model is implemented. There are several points I want to mention:

First, I got negative entropy in the results and was confused, so I did some research, and it became clear to me that negative entropy is sometimes fine for continuous distributions. The pattern of the entropy was exactly what I had expected.
Don’t use torch; it is not implemented there yet. Use tf2 instead.
Using this distribution can be beneficial; for reference, look here. The Dirichlet distribution is a generalization of the beta distribution.
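The negative entropy mentioned above can be checked with the closed-form differential entropy of a Dirichlet distribution. The sketch below uses only the standard library; `digamma` is a hand-rolled approximation (recurrence plus an asymptotic series) written for this example, and both function names are hypothetical.

```python
import math


def digamma(x):
    """Approximate the digamma function for x > 0."""
    result = 0.0
    # Shift x upward with psi(x) = psi(x + 1) - 1/x until the series is accurate.
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result + math.log(x) - 0.5 / x - series


def dirichlet_entropy(alpha):
    """Differential entropy of Dirichlet(alpha), in nats."""
    k = len(alpha)
    a0 = sum(alpha)
    log_beta = sum(math.lgamma(a) for a in alpha) - math.lgamma(a0)
    return (log_beta
            + (a0 - k) * digamma(a0)
            - sum((a - 1.0) * digamma(a) for a in alpha))


print(dirichlet_entropy([1.0, 1.0, 1.0]))     # uniform on the simplex: -log(2), about -0.693
print(dirichlet_entropy([50.0, 50.0, 50.0]))  # sharply peaked: more negative
```

Even the uniform Dirichlet over three components has negative entropy, and the value drops further as the concentrations grow and the policy becomes more deterministic. That is consistent with entropy decreasing over training.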
