Jump-Start Reinforcement Learning

mannyv · December 11, 2024, 2:34am

It looks like the jump-start has helped: you have a higher win rate, and the variance of the win rate appears much lower. Any idea why the performance degrades over training in both cases?

I feel like a broken record because I point a lot of people in this direction, but there can be some issues training PPO with continuous actions. If you have not seen it, you might want to give this post a read to see if you are experiencing this issue.
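For anyone skimming, here is a rough sketch of one failure mode that is often reported with Gaussian policies over continuous actions. It may or may not be the issue meant above. If the network's log-std output drifts to an extreme value, the log-probability can blow up, and clamping the log-std is one common mitigation. The bounds and function names below are hypothetical and chosen only for illustration; this is not RLlib's code.

```python
import math

# Hypothetical bounds, picked for illustration only (not taken from RLlib).
LOG_STD_MIN, LOG_STD_MAX = -20.0, 2.0


def clamp_log_std(log_std):
    """Restrict the log-std to a fixed range before it is exponentiated."""
    return max(LOG_STD_MIN, min(LOG_STD_MAX, log_std))


def gaussian_log_prob(action, mean, log_std):
    """Log-density of a 1-D Gaussian parameterised by mean and log-std."""
    std = math.exp(log_std)
    z = (action - mean) / std
    return -0.5 * z * z - log_std - 0.5 * math.log(2.0 * math.pi)


raw_log_std = -800.0  # an extreme value a drifting policy head could emit

# exp(-800) underflows to 0.0, so the unclamped version divides by zero.
try:
    gaussian_log_prob(0.1, 0.0, raw_log_std)
    unclamped_ok = True
except ZeroDivisionError:
    unclamped_ok = False

# With the clamp applied first, the log-probability stays finite.
clamped = gaussian_log_prob(0.1, 0.0, clamp_log_std(raw_log_std))
print(unclamped_ok, math.isfinite(clamped))
```

In a tensor library the unclamped case would typically show up as inf or nan rather than an exception, which is the kind of symptom worth checking for in the training logs.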

