Are there any resources that can help me tune hyperparameters for my custom environment? E.g. how to use the data reported in learner_stats in order to find better ranges for the hyperparameters? Or at least descriptions of the reported statistics?
Hey @akhodakivskiy, difficult to say. From my own experience, it certainly helps to:
- check the entropy of your produced action distributions (make sure it's neither too high nor too low). Adjust via `entropy_coeff` in your config (if your algo supports this key).
- check the explained variance of your value function output. Make sure it's as close to 1.0 as possible. It may help to try a separate value function model (set `config.model.vf_share_layers=False`).
- try different `lr` (learning rates) and `train_batch_size`!
- try a better Ray Tune algorithm, like PBT (Population Based Training).
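
For anyone unsure what the first two checks actually measure, here is a minimal sketch in plain Python. It is not RLlib's own code: the function names and the sample numbers are made up for illustration, and it assumes a discrete action space and the common definition of explained variance, 1 - Var(targets - predictions) / Var(targets). RLlib reports its own versions of these numbers, and the exact key names depend on the algorithm and version.

```python
import math
from statistics import pvariance


def categorical_entropy(probs):
    """Shannon entropy (in nats) of one discrete action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def explained_variance(targets, predictions):
    """1 - Var(targets - predictions) / Var(targets); 1.0 is a perfect fit."""
    target_var = pvariance(targets)
    if target_var == 0.0:
        return float("nan")  # undefined when the targets are constant
    residuals = [t - p for t, p in zip(targets, predictions)]
    return 1.0 - pvariance(residuals) / target_var


# Hypothetical example data, not taken from any real training run.
uniform_policy = [0.25, 0.25, 0.25, 0.25]  # maximally uncertain over 4 actions
peaked_policy = [0.97, 0.01, 0.01, 0.01]   # almost deterministic

value_targets = [1.0, 2.0, 3.0, 4.0]
value_predictions = [1.1, 1.9, 3.2, 3.8]

# The uniform policy hits the maximum entropy for 4 actions, log(4).
print("entropy (uniform):", categorical_entropy(uniform_policy))
# The peaked policy has much lower entropy (little exploration left).
print("entropy (peaked): ", categorical_entropy(peaked_policy))
# Close to 1.0 means the value function tracks its targets well.
print("explained variance:", explained_variance(value_targets, value_predictions))
```

Roughly speaking, entropy stuck near the maximum suggests the policy is still close to random, entropy collapsing toward zero early suggests it stopped exploring, and explained variance near or below zero suggests the value function is not predicting returns any better than a constant would.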
To hijack this comment a bit - is there any documentation that gives a proper (in-depth) explanation of every statistic reported by Tune?