With TF2, Keras's model.summary() prints the model summary, and RLlib also defines base_model. But with PyTorch, how can I output the model summary?
Hey @bug404. If you have an RLlib Trainer object:
print(trainer.get_policy().model)
If you are using tune.run, try adding this line to ray/rllib/policy/torch_policy.py:~164 (after we assign self.model to the 0th of the multi-GPU towers):
print(self.model)
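For something closer to the Keras summary table, the parameter shapes and counts can be listed by hand. Below is a minimal sketch, assuming only standard PyTorch; `summarize` and `demo_model` are hypothetical names used for illustration, and the small Sequential network is a stand-in, not an RLlib model.

```python
import torch.nn as nn


def summarize(model: nn.Module) -> int:
    """Print one line per parameter tensor and return the total parameter count."""
    total = 0
    for name, param in model.named_parameters():
        count = param.numel()
        total += count
        print(f"{name:<30} {str(tuple(param.shape)):<15} {count:>8}")
    print(f"Total parameters: {total}")
    return total


# Stand-in network (hypothetical, not an RLlib model): 20 -> 32 -> 6.
demo_model = nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, 6))
total_params = summarize(demo_model)
```

Since print(trainer.get_policy().model) shows a regular torch module, the same helper should also work as summarize(trainer.get_policy().model), though that call is untested here.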
It’s very helpful, thank you very much.
Modifying the source code is a very cool way to support this feature, haha.
Hey @sven1977,
I used policy.model.base_model.summary() to output the shape of the model, but it reports:
AttributeError: 'FullyConnectedNetwork' object has no attribute 'base_model'
So I just printed trainer.get_policy().model to see it, and the output is:
FullyConnectedNetwork(
  (_logits): SlimFC(
    (_model): Sequential(
      (0): Linear(in_features=32, out_features=6, bias=True)
    )
  )
  (_hidden_layers): Sequential(
    (0): SlimFC(
      (_model): Sequential(
        (0): Linear(in_features=20, out_features=32, bias=True)
        (1): ReLU()
      )
    )
    (1): SlimFC(
      (_model): Sequential(
        (0): Linear(in_features=32, out_features=64, bias=True)
        (1): ReLU()
      )
    )
    (2): SlimFC(
      (_model): Sequential(
        (0): Linear(in_features=64, out_features=32, bias=True)
        (1): ReLU()
      )
    )
  )
  (_value_branch_separate): Sequential(
    (0): SlimFC(
      (_model): Sequential(
        (0): Linear(in_features=20, out_features=32, bias=True)
        (1): ReLU()
      )
    )
    (1): SlimFC(
      (_model): Sequential(
        (0): Linear(in_features=32, out_features=64, bias=True)
        (1): ReLU()
      )
    )
    (2): SlimFC(
      (_model): Sequential(
        (0): Linear(in_features=64, out_features=32, bias=True)
        (1): ReLU()
      )
    )
  )
  (_value_branch): SlimFC(
    (_model): Sequential(
      (0): Linear(in_features=32, out_features=1, bias=True)
    )
  )
)
The input dimension of my environment is 20 and the output dimension is 3. The environment definition and network initialization are as follows:
import gym
from gym.spaces import Box

class Env(gym.Env):
    def __init__(self):
        # 3-dimensional continuous action, 20-dimensional observation
        self.action_space = Box(-1, 1, (3,))
        self.observation_space = Box(-1, 1, (20,))
        ...
------------------
ray.init()
config = DEFAULT_CONFIG.copy()
config['model']['fcnet_hiddens'] = [32, 64, 32]
config['model']['fcnet_activation'] = "relu"
...
So I'm a little confused about why the out_features of the network is not 3.
Hi @Glaucus-2G,
If you take a look at this dataflow diagram, the part you are describing is the green model box on the right part of the image. The outputs of the model are the logits that are sent to the ActionDistribution box. It is the ActionDistribution box that converts those logits into the actual actions.
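In your example, the model's _logits layer produces 6 outputs, since for a 3-dimensional Box action space the default Gaussian distribution needs a mean and a log standard deviation per action dimension. The ActionDistribution then turns those 6 logits into 3 action values. Does this make sense?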
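To make the logits-to-actions step concrete, here is a minimal sketch in plain Python. It assumes a diagonal Gaussian distribution that reads the first half of the logits as means and the second half as log standard deviations, so the 6 outputs of the printed _logits layer would yield 3 action values. `sample_diag_gaussian` is a hypothetical helper, not RLlib's API, and the logit values are made up.

```python
import math
import random


def sample_diag_gaussian(logits, rng):
    """Split logits into means and log-stds, then sample one value per action dim."""
    if len(logits) % 2 != 0:
        raise ValueError("expected an even number of logits")
    half = len(logits) // 2
    means, log_stds = logits[:half], logits[half:]
    # Each action dimension is sampled independently from N(mean, exp(log_std)^2).
    return [rng.gauss(m, math.exp(s)) for m, s in zip(means, log_stds)]


rng = random.Random(0)
logits = [0.1, -0.2, 0.3, -1.0, -1.0, -1.0]  # made-up values: 3 means, then 3 log-stds
action = sample_diag_gaussian(logits, rng)
print(len(logits), "logits ->", len(action), "action values")
```

The model itself never needs an output layer of size 3 under this assumption: the halving happens inside the distribution, which is why out_features of _logits is twice the action dimension.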
Hey @mannyv,
Thank you for your reply! You helped me understand this very well. I had overlooked the role of the ActionDistribution box before.