Okay, I solved it partially. In short: use TensorFlow as the framework.
While debugging this, I came to suspect that RLlib’s torch support has problems, especially when dealing with MultiDiscrete input.
I was working from this example script from RLlib’s repo and set the training framework to torch, because 1) the algorithm that I am going to train is implemented in torch, and 2) judging from the default parameter settings, torch seemed to be the default choice.
That gave me the previously mentioned error:
AttributeError: 'TorchCategorical' object has no attribute 'log_prob'
However, judging from the torch repo, ‘TorchCategorical’ does have this attribute.
After going through the scripts, I couldn’t solve the problem. But when I looked back at the original example script, I found that the actual default framework is TensorFlow (‘tf’). When I tested with ‘tf’ as my framework, training ran successfully.
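For anyone who wants to try the same switch, here is a minimal sketch of the one setting involved. It assumes the older dict-style RLlib config, where the framework is selected by the "framework" key. The helper name make_config, the SUPPORTED_FRAMEWORKS tuple, and the "num_workers" value are placeholders of mine for illustration and are not taken from the example script.

```python
# Framework strings I know RLlib's dict-style config to accept; the list may
# differ between RLlib versions, so treat it as an assumption.
SUPPORTED_FRAMEWORKS = ("tf", "tf2", "torch")


def make_config(framework: str = "tf") -> dict:
    """Return a minimal dict-style config with the framework set (hypothetical helper)."""
    if framework not in SUPPORTED_FRAMEWORKS:
        raise ValueError(f"unknown framework: {framework!r}")
    return {
        # "tf" is the setting that trained successfully with MultiDiscrete actions.
        "framework": framework,
        # Placeholder value, not from the post.
        "num_workers": 0,
    }


config = make_config("tf")
print(config["framework"])
```

The resulting dict would then be merged into whatever config the example script already builds; only the "framework" entry matters for the error discussed here.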
I also tested other example scenes released with Unity’s ML-Agents, setting the framework to ‘torch’ instead of ‘tf’, and the same error appeared whenever the action space was MultiDiscrete.
The tutorial for this test can be found here: Unity-and-rllib
As my algorithm implementation is based on torch, I might still need to find a solution to this problem… But if you just want to play around with RLlib’s included algorithms, just start from the example script and choose TensorFlow!