Thanks @sven1977 for your reply. There was a mismatch between the environment documentation and the code, so I set n_sensors = 30. But now I get the error:
ValueError: Input 1 is incompatible with layer model_1: expected shape=(None, 968), found shape=(2000, 242)
But if I try (out of curiosity) to double the shape dim in this layer setting:
opp_obs = tf.keras.layers.Input(shape=(2*opp_obs_dim, ), name="opp_obs")
I get the error:
ValueError: Input 1 is incompatible with layer model_1: expected shape=(None, 1936), found shape=(None, 968)
That seems odd to me, since 968 is the correct shape (opp_obs_dim = 968). In any case, I looked at the postprocessing function again as you suggested, and I think the problem might be the initialisation. Do you think I should change something here?
# Policy hasn’t been initialized yet, use zeros.
sample_batch[OPPONENT_OBS] = np.zeros_like([np.zeros((obs_dim * (n_pursuers - 1)))])
sample_batch[OPPONENT_ACTION] = np.zeros_like([np.zeros(act_dim * (n_pursuers - 1))])
### I think I don’t have to change this
sample_batch[SampleBatch.VF_PREDS] = np.zeros_like(sample_batch[SampleBatch.REWARDS], dtype=np.float32)
The only thing I changed in the code is n_sensors = 30 at the beginning.
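
To make the shapes concrete, here is a minimal NumPy sketch of the arithmetic as I understand it. It is not the actual postprocessing code: n_pursuers = 5 is an assumption inferred from 968 / 242 = 4 opponents, and stack_opponent_obs is a hypothetical helper I made up for illustration. It suggests that the first error (found shape (2000, 242)) is what I would see if only one opponent's observations ended up in the batch, and that the zeros placeholder has shape (1, 968), which would explain the second error after doubling the input layer.

```python
import numpy as np

# Assumed values, inferred from the error messages (not confirmed by the env code).
obs_dim = 242                              # per-agent observation size with n_sensors = 30
n_pursuers = 5                             # assumption: 968 / 242 = 4 opponents
opp_obs_dim = obs_dim * (n_pursuers - 1)   # 968, the size the model input expects


def stack_opponent_obs(opponent_obs_list):
    """Hypothetical helper: concatenate per-opponent batches along the feature axis."""
    return np.concatenate(opponent_obs_list, axis=-1)


T = 2000  # batch size reported in the first error
per_opponent = [
    np.zeros((T, obs_dim), dtype=np.float32) for _ in range(n_pursuers - 1)
]

# Passing a single opponent's batch reproduces the "found" shape of the first error.
one_opponent_only = per_opponent[0]
print(one_opponent_only.shape)   # (2000, 242)

# Concatenating all opponents gives the shape the model expects.
all_opponents = stack_opponent_obs(per_opponent)
print(all_opponents.shape)       # (2000, 968)

# The zeros placeholder from the "not initialized yet" branch.
placeholder = np.zeros_like([np.zeros(obs_dim * (n_pursuers - 1))])
print(placeholder.shape)         # (1, 968)
```

If this reading is right, the placeholder itself already has the expected width, and the part to check would be how the opponent observations are gathered once the policy is initialized.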