Action masking error

Hi @kourosh,

I'm not sure what is going on here, but I was discussing an issue with @Lars_Simon_Zehnder recently, and we have both independently debugged cases where the policy's loss explodes. That produces NaN logits from the policy, which in turn show up as illegal action values in Discrete spaces (1 larger than the size of the space).
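For anyone who wants to check whether they are hitting the same thing, here is a minimal pure-Python sketch. It is only an illustration under my own assumptions: `toy_sample` is a hypothetical inverse-CDF sampler, not the sampler RLlib or any deep learning framework actually uses, and `check_logits_and_action` is a hypothetical guard you could adapt. It shows one plausible way NaN logits can turn into an index one past the last valid action, because every comparison against NaN is false.

```python
import math


def softmax(logits):
    # Numerically stable softmax; NaN inputs propagate to NaN outputs.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_sample(probs, u):
    # Hypothetical inverse-CDF sampler: return the first index whose
    # cumulative probability exceeds u (u is a uniform draw in [0, 1)).
    cdf = 0.0
    for i, p in enumerate(probs):
        cdf += p
        if u < cdf:  # always False once cdf is NaN
            return i
    # Fall-through: no bucket matched, so the index is len(probs),
    # which is outside a Discrete(len(probs)) space.
    return len(probs)


def check_logits_and_action(logits, action, n):
    # Hypothetical guard: fail loudly instead of passing a bad action on.
    if not all(math.isfinite(x) for x in logits):
        raise ValueError("non-finite logits; check for exploding loss")
    if not 0 <= action < n:
        raise ValueError(f"action {action} is outside Discrete({n})")


n = 4
good_action = toy_sample(softmax([0.0] * n), 0.3)
bad_logits = [float("nan")] * n
bad_action = toy_sample(softmax(bad_logits), 0.3)
```

With the toy sampler, `bad_action` comes out as `n`, which matches the symptom above. Calling the guard right after the forward pass would at least point at the logits rather than at the environment.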

Just a heads-up, because I think I have seen a couple of other posts in the forums that sound similar.
