My Ray program stops learning when using distributed compute

Stefan-1313 · August 16, 2022, 10:03am

For anyone ending up here: please check out the other training architectures. SimpleQTrainer is just too simple and does not let you control the ratio of training to experience collection, but others do!

The simplest one that supports this is DQNTrainer. Try it instead of SimpleQTrainer. The parameter that controls the train/experience-collection ratio is config_simple["training_intensity"].
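To make the ratio concrete, here is a rough sketch in plain Python (no Ray import needed to run it). It assumes, based on my reading of the DQN config docs, that when training_intensity is left unset RLlib falls back to a "natural" ratio of train_batch_size / (rollout_fragment_length x num_workers x num_envs_per_worker). The helper natural_training_intensity and the concrete numbers are made up for illustration; only the config keys are RLlib's.

```python
def natural_training_intensity(config):
    """Trained-to-sampled timestep ratio when training_intensity is None.

    Assumption: this mirrors the fallback described in the RLlib DQN docs;
    it is an illustrative helper, not a function from the Ray API.
    """
    sampled_per_round = (
        config["rollout_fragment_length"]
        * config.get("num_envs_per_worker", 1)
        * max(config["num_workers"], 1)  # num_workers=0 still samples locally
    )
    return config["train_batch_size"] / sampled_per_round


# Hypothetical values, chosen only to make the arithmetic easy to follow.
config_simple = {
    "env": "CartPole-v1",
    "num_workers": 8,
    "num_envs_per_worker": 1,
    "rollout_fragment_length": 4,
    "train_batch_size": 32,
}

# Ratio with a single worker vs. the 8 workers configured above.
single = natural_training_intensity({**config_simple, "num_workers": 1})
distributed = natural_training_intensity(config_simple)

# Pin the ratio to the single-worker value so that adding workers
# does not dilute the amount of training per sampled timestep.
config_simple["training_intensity"] = single

# The dict would then be passed to the trainer, e.g. (Ray 1.x import path,
# newer releases renamed the class):
#   from ray.rllib.agents.dqn import DQNTrainer
#   trainer = DQNTrainer(config=config_simple)
```

With these numbers the unset ratio drops as workers are added, which matches the "stops learning when distributed" symptom; setting training_intensity explicitly keeps it fixed regardless of worker count.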

Good luck!
