Controlling initialization of model weights

How severely does this issue affect your experience of using Ray?

  • High: It blocks me from completing my task.

I’m troubleshooting a problem with my MARL experiment. I’m using RLlib with the PPO algorithm.

Part of my observation is a Box with shape (1,), i.e. just one input neuron, and the agents are learning a positive correlation between this neuron and some output neuron. I designed the environment so that a positive correlation and a negative correlation are equally lucrative, and so that as soon as some of the agents settle on positive or negative, all other agents are incentivized to choose the same correlation.

However, when I run the experiment, the agents always choose a positive correlation. This happens no matter how many times I run the experiment.

I’m wondering whether the initialization values of the model weights have anything to do with it. Maybe RLlib starts with some positive values or something else that gives all agents a starting bias towards a positive correlation?

Is there any argument in RLlib that allows me to control the starting weights?

Thanks for your help,
Ram Rachum.

Hi @cool-RR

I don’t think that feature exists in native RLlib; it is not included in the ModelConfigDict. I’m not sure which framework you are using, but with TensorFlow you can build a custom model and use the standard Keras weight and bias initializers when defining your model(s). Something similar probably exists in Torch.
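For the Torch side, a minimal sketch of the same idea in plain PyTorch (not RLlib's own API) could look like this. The class name SmallPolicyNet, the layer sizes, and the choice of a small zero-mean normal initializer with zero biases are all assumptions for illustration; wiring the module into RLlib as a custom model is left out.

```python
import torch
from torch import nn


class SmallPolicyNet(nn.Module):
    """Hypothetical two-layer network with explicitly chosen initializers."""

    def __init__(self, obs_dim=1, hidden_dim=16, num_outputs=2):
        super().__init__()
        self.hidden = nn.Linear(obs_dim, hidden_dim)
        self.out = nn.Linear(hidden_dim, num_outputs)
        # Overwrite PyTorch's default initialization with a symmetric,
        # zero-mean scheme and zero biases (an assumed choice, not a default
        # taken from RLlib).
        for layer in (self.hidden, self.out):
            nn.init.normal_(layer.weight, mean=0.0, std=0.01)
            nn.init.zeros_(layer.bias)

    def forward(self, obs):
        return self.out(torch.tanh(self.hidden(obs)))


# Seeding before construction makes the random draw reproducible.
torch.manual_seed(0)
net = SmallPolicyNet()
```

With zero biases and tanh, the freshly initialized network is an odd function of the observation, so its output at observation 0 is exactly 0. The sign of the initial slope still depends on the random draw, so different seeds should be tried when checking for a systematic bias.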

If you have specific weights and biases you want to start from, say from some earlier training or even set by “hand”, this example may help you, although it is single-agent and a few Ray versions old. See specifically the subclassing of the PPO algorithm in line 115, the loading of weights and biases in line 120, and the important sync of workers in line 121. You will probably need to do this for each agent/policy in MARL as well.
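Roughly, the load-then-sync step for several policies could be sketched as below. This is not the code from the linked example: the helper name load_initial_weights is made up, and it assumes an Algorithm object that exposes get_policy(policy_id), policies with set_weights(weights), and a worker set at algo.workers with sync_weights(). Those attribute names depend on the Ray version, so please check them against the one you run.

```python
def load_initial_weights(algo, weights_by_policy):
    """Set starting weights on each policy, then push them to the workers.

    `weights_by_policy` maps a policy ID to a weights object in the same
    format that the policy's own get_weights() returns (assumption).
    """
    for policy_id, weights in weights_by_policy.items():
        # Update the local copy of each policy.
        algo.get_policy(policy_id).set_weights(weights)
    # Without this sync, rollout workers would keep sampling with their
    # original random initial weights.
    algo.workers.sync_weights()
```

It would be called once after the algorithm is built and before the first train() call, with one entry per agent policy.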

BR

Jorgen

Thanks Jorgen! I’ll check out these links.

It should be possible to initialize the weights from a framework checkpoint. If you use a custom model and pass the path to the model weights in the model_config dict, you can initialize the weights in the constructor. I’m doing that currently and will update the thread if it does not work!
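A minimal sketch of that constructor pattern in plain PyTorch, assuming the checkpoint is a state_dict saved with torch.save. The class name PreloadableNet and the weights_path argument are hypothetical; in an RLlib custom model the path would arrive through the model config instead (for example via custom_model_config), which is not shown here.

```python
import torch
from torch import nn


class PreloadableNet(nn.Module):
    """Hypothetical network that can start from saved weights."""

    def __init__(self, obs_dim=1, hidden_dim=16, num_outputs=2, weights_path=None):
        super().__init__()
        self.hidden = nn.Linear(obs_dim, hidden_dim)
        self.out = nn.Linear(hidden_dim, num_outputs)
        if weights_path is not None:
            # Replace the random initialization with the saved parameters.
            # The checkpoint must match this architecture exactly.
            state_dict = torch.load(weights_path, map_location="cpu")
            self.load_state_dict(state_dict)

    def forward(self, obs):
        return self.out(torch.tanh(self.hidden(obs)))
```

A matching checkpoint can be produced with torch.save(model.state_dict(), path) from a model of the same shape.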