My Ray program stops learning when using distributed compute

For anyone ending up here, please check out the other trainers. SimpleQTrainer is just too simple and does not support this, but others do!

The simplest one that supports it is DQNTrainer, so try that instead of SimpleQTrainer. The parameter that controls the ratio of training to experience collection is config_simple["training_intensity"].
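Here is a rough sketch of what that could look like. It assumes a Ray 1.x install where DQNTrainer is importable from ray.rllib.agents.dqn. The environment name, the worker count, the batch sizes, and the helper names natural_training_intensity and build_trainer are all made up for illustration. The "natural" ratio formula follows my reading of the RLlib DQN config docs, so double-check it against your Ray version.

```python
def natural_training_intensity(train_batch_size, rollout_fragment_length,
                               num_workers, num_envs_per_worker=1):
    """Ratio RLlib is assumed to use when training_intensity is None:
    timesteps trained on per timestep sampled from the environment."""
    sampled_per_round = (rollout_fragment_length
                         * num_envs_per_worker
                         * max(num_workers, 1))
    return train_batch_size / sampled_per_round


# Illustrative values only, not tuned for any particular problem.
config_simple = {
    "num_workers": 4,
    "rollout_fragment_length": 4,
    "train_batch_size": 32,
}

# With 4 workers the natural ratio drops to 32 / (4 * 1 * 4) = 2.0.
natural = natural_training_intensity(
    config_simple["train_batch_size"],
    config_simple["rollout_fragment_length"],
    config_simple["num_workers"],
)

# Ask for more training per sampled timestep than the natural ratio.
config_simple["training_intensity"] = 4 * natural


def build_trainer(config, env="CartPole-v0"):
    """Hypothetical helper: builds a DQNTrainer (assumes the Ray 1.x API)."""
    import ray
    from ray.rllib.agents.dqn import DQNTrainer

    ray.init(ignore_reinit_error=True)
    return DQNTrainer(config=config, env=env)
```

Calling build_trainer(config_simple) and then trainer.train() in a loop should let you see whether the reward keeps improving as you add workers. If it stalls again, raising training_intensity is the first thing I would try.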

Good luck!