Hi all,
I’m currently using the LSTM wrapper in combination with FCNet. From my understanding, the layers are shared in this setup, so I need to tune the VF coefficient (`vf_loss_coeff`). My current value is 1.25; however, I get roughly:
policy_loss: 0.04
vf_loss: 0.0001
My rewards range from roughly -0.01 to 3.0 (the theoretical max is 10, but realistically they top out around 3.0).
Is this fine? Or should I increase the coefficient to bring the vf_loss up? Could this be impacting my training?
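
For context, here’s roughly how I’m setting this up (a minimal sketch assuming PPO and the config-dict model API; the env name and hidden sizes are placeholders, only `vf_loss_coeff=1.25` matches my actual value):

```python
# Minimal sketch of my setup (env name and hidden sizes are placeholders;
# only vf_loss_coeff matches my real config).
from ray.rllib.algorithms.ppo import PPOConfig

config = (
    PPOConfig()
    .environment("CartPole-v1")  # placeholder env
    .training(
        # Total loss is roughly:
        #   policy_loss + vf_loss_coeff * vf_loss - entropy_coeff * entropy
        vf_loss_coeff=1.25,
        model={
            "fcnet_hiddens": [256, 256],  # placeholder sizes
            "use_lstm": True,             # wraps the FCNet in the LSTM wrapper
            "vf_share_layers": True,      # value head shares the FC/LSTM trunk
        },
    )
)
algo = config.build()
```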