Issue with custom environment

Hi, I’m using a custom environment. I was training with stable-baselines3, but I’m trying to migrate to RLlib for scaling purposes. However, after several training iterations and a lot of hyperparameter tuning, my model never converged to a result nearly as good as the one from stable-baselines3, so I ran an episode with a trained checkpoint to see what is going on. My agent is always outputting the extreme actions, like:

array([-1., 1., 1., 1., -1.], dtype=float32)

(Actions are normalized between -1 and 1.)

I don’t think the environment itself has a bug, since it trains fine with stable-baselines3, and I don’t see an issue with my RLlib config either, so I can’t figure out why my agent gets stuck at these extreme actions.
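
For reference, this is roughly how I’m rolling out the trained checkpoint (a minimal sketch, not my actual code: `MyCustomEnv` is a stand-in for my environment, the checkpoint path is a placeholder, and I’m assuming the older `agents`-style PPOTrainer API):

```python
import numpy as np
import gym
import ray
from ray.rllib.agents.ppo import PPOTrainer
from ray.tune.registry import register_env


class MyCustomEnv(gym.Env):
    """Stand-in for my real environment (same 5-dim action space, normalized to [-1, 1])."""

    def __init__(self, env_config=None):
        self.action_space = gym.spaces.Box(-1.0, 1.0, shape=(5,), dtype=np.float32)
        self.observation_space = gym.spaces.Box(-np.inf, np.inf, shape=(10,), dtype=np.float32)

    def reset(self):
        return self.observation_space.sample()

    def step(self, action):
        return self.observation_space.sample(), 0.0, True, {}


ray.init()
register_env("my_custom_env", lambda cfg: MyCustomEnv(cfg))

trainer = PPOTrainer(config={"env": "my_custom_env", "framework": "torch"})
trainer.restore("/path/to/checkpoint/checkpoint-100")  # placeholder path

env = MyCustomEnv()
obs = env.reset()
done = False
while not done:
    # Deterministic action from the trained policy (no exploration noise).
    action = trainer.compute_single_action(obs, explore=False)
    print(action)  # keeps printing values like [-1., 1., 1., 1., -1.]
    obs, reward, done, info = env.step(action)
```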

Hey @PatrickSampaioUSP, thanks for posting this issue. Would you be able to share your environment and config so we can take a look? Without those, it’s impossible to figure out why RLlib isn’t learning. Thanks.

I have a reproduction in the thread “Issues reproducing stable-baselines3 PPO performance with rllib”.