How does RLlib calculate the number of parameters (weights and biases)?

Hi, :slight_smile:
I have a question: how does RLlib calculate the number of parameters?
I am using the PPO algorithm with input_dim = 16, hiddens = [256, 256], and a 4-dimensional action space.
I have attached a picture below, but the result looks strange.

[attached image: model summary]

Where does the number '8' come from?
If I use the compute_action function, I get a 4-dimensional action.
Could anyone explain this to me?
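As a sanity check on the counts, a fully connected layer with `n_in` inputs and `n_out` outputs has `n_in * n_out` weights plus `n_out` biases. A minimal sketch, assuming the network is 16 → 256 → 256 → 8 (the 8 is an assumption: for a continuous 4-dim action space, a diagonal Gaussian action distribution needs 2 * 4 outputs, one mean and one log-std per action dimension):

```python
def fc_params(n_in, n_out):
    # weights (n_in * n_out) plus one bias per output unit
    return n_in * n_out + n_out

# Assumed layer sizes: input 16, hiddens [256, 256],
# output 2 * 4 = 8 (mean + log std for a 4-dim continuous action).
layers = [(16, 256), (256, 256), (256, 8)]
total = sum(fc_params(n_in, n_out) for n_in, n_out in layers)
print(total)  # 4352 + 65792 + 2056 = 72200
```

If the printed total matches the summary in the screenshot, the extra outputs are just the distribution parameters, not extra actions.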

import torch
import torch.nn.functional as F

state = env.reset()
action = agent.compute_action(state)

# Manually forward the same state through the extracted policy weights.
x = torch.from_numpy(state).float()
out = F.relu(F.linear(x, torch.from_numpy(policy_wei[1][1]),
                      torch.from_numpy(policy_bias[1][1])))
out = F.relu(F.linear(out, torch.from_numpy(policy_wei[2][1]),
                      torch.from_numpy(policy_bias[2][1])))
out = torch.tanh(F.linear(out, torch.from_numpy(policy_wei[0][1]),
                          torch.from_numpy(policy_bias[0][1])))

policy out: tensor([ 0.3966,  0.6118,  0.5565,  0.0270, -0.9395,  0.8203, -0.0793, -0.3315])
state: [0.09984301 0.09992453 0.09992143 0.09991941 0.29046342 0.87227278
 0.98566566 0.10100834 0.21494237 0.         0.5        0.
 0.5        0.         1.         0.25      ] 
action: [-0.85631555  1.          0.8432337  -0.7363308 ]

Why do I get different results from compute_action and the policy network for the same state?
How does RLlib's compute_action work?
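One likely explanation (an assumption, not verified against the RLlib source here): the 8 policy outputs are interpreted as the mean and log-std of a diagonal Gaussian action distribution, and compute_action samples from that distribution during exploration, so it will not match the raw network output. A minimal sketch of that interpretation:

```python
import torch

# The 8-dim policy output from the post, split into an assumed
# (mean, log_std) pair for a diagonal Gaussian over 4 action dims.
policy_out = torch.tensor([0.3966, 0.6118, 0.5565, 0.0270,
                           -0.9395, 0.8203, -0.0793, -0.3315])
mean, log_std = torch.chunk(policy_out, 2)

# compute_action (with exploration on) would sample, roughly like:
action = torch.normal(mean, log_std.exp())
print(mean.numel(), action.numel())  # both 4-dimensional
```

Under this assumption, passing explore=False to compute_action should return the deterministic mean instead of a sample.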

Have a look here for an overview: RLlib Models, Preprocessors, and Action Distributions β€” Ray v2.0.0.dev0