[Rllib] Centralised critic PPO for multiagent env (pettingzoo waterworld)

george_sk · March 19, 2021, 11:08am

Thanks @sven1977 for your reply. There was a mismatch between the environment documentation and the code, so I set n_sensors = 30. But now I get the error:

ValueError: Input 1 is incompatible with layer model_1: expected shape=(None, 968), found shape=(2000, 242)

However, if I (out of curiosity) double the shape dimension of this input layer by setting:

    opp_obs = tf.keras.layers.Input(shape=(2*opp_obs_dim, ), name="opp_obs")

I get the error:

ValueError: Input 1 is incompatible with layer model_1: expected shape=(None, 1936), found shape=(None, 968)

That seems odd to me, since 968 is the correct shape (opp_obs_dim = 968). In any case, I looked at the postprocessing function again, as you suggested, and I think the problem might be the initialisation. Do you think I should change something here?

    # Policy hasn't been initialized yet, use zeros.
    sample_batch[OPPONENT_OBS] = np.zeros_like([np.zeros((obs_dim * (n_pursuers - 1)))])
    sample_batch[OPPONENT_ACTION] = np.zeros_like([np.zeros(act_dim * (n_pursuers - 1))])
    ### I think I don't have to change this
    sample_batch[SampleBatch.VF_PREDS] = np.zeros_like(sample_batch[SampleBatch.REWARDS], dtype=np.float32)
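As a sanity check on the numbers in the two errors, here is a small sketch of the shape arithmetic. It assumes obs_dim = 242 (the per-agent size in the first error) and n_pursuers = 5 (my guess, since 968 / 242 = 4 opponents); neither value is confirmed by the environment code here. If those assumptions hold, the zero initialisation already has the expected width of 968, which would suggest the (2000, 242) batch comes from a different branch of the postprocessing function.

```python
import numpy as np

obs_dim = 242     # per-agent observation size, taken from the first error message
n_pursuers = 5    # assumption: 968 / 242 = 4 opponents, so 5 pursuers in total

# Width the central critic expects for the concatenated opponent observations.
opp_obs_dim = obs_dim * (n_pursuers - 1)

# Same expression as the zero initialisation: wrapping the 1-D array in a list
# adds a leading batch dimension of size 1.
init_opp_obs = np.zeros_like([np.zeros(obs_dim * (n_pursuers - 1))])

print(opp_obs_dim)         # 968
print(init_opp_obs.shape)  # (1, 968)
```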

The only thing I changed in the code is n_sensors = 30 at the beginning.
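One possible direction, sketched below with plain NumPy only: if the non-initialisation branch passes a single opponent's observations of shape (2000, 242) to the critic, the observations of all other agents would need to be concatenated along the feature axis to reach width 968. The helper name concat_opponent_obs is hypothetical, the values T = 2000, obs_dim = 242 and n_pursuers = 5 are assumptions read off the error message, and the sketch assumes every opponent batch has the same number of timesteps. How each opponent's array is pulled out of the RLlib batches is not shown.

```python
import numpy as np

def concat_opponent_obs(other_agent_obs):
    """Hypothetical helper: join per-opponent observation batches.

    other_agent_obs: list of arrays, each of shape (T, obs_dim).
    Returns one array of shape (T, obs_dim * number_of_opponents).
    """
    return np.concatenate(other_agent_obs, axis=1)

# Assumed sizes, taken from the error message rather than the environment code.
T, obs_dim, n_pursuers = 2000, 242, 5

# Stand-in data: one zero-filled observation batch per opponent.
others = [np.zeros((T, obs_dim), dtype=np.float32) for _ in range(n_pursuers - 1)]

opp_obs = concat_opponent_obs(others)
print(opp_obs.shape)  # (2000, 968)
```

The same concatenation would apply to the opponent actions, with act_dim in place of obs_dim.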
