That being said, I have managed to get an almost identical setup working with an external env for my game using a PPO trainer, but the synchronous overhead tends to bottleneck things too much (mainly because my server is upload-bandwidth starved), so remote clients take too long to update. This is where I was hoping DD-PPO would have less network overhead. Even if the overhead is the same, as long as the weight updates from the main process are non-blocking for the main actor playing the game, that would be perfect.
Please let me know if my hopes/understanding of DD-PPO here is flawed!
Actually, @Denys_Ashikhin , this is a known issue and unfortunately not surfaced with a proper error message. DD-PPO uses torch.distributed.init_process_group, which throws an error on Windows.
I’ll add a check to DD-PPO’s validate_config and at least provide a meaningful NotSupported error message.
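For illustration, a minimal sketch of what such a pre-flight check could look like. Note that `validate_ddppo_config` and its signature are hypothetical, not RLlib's actual API; the idea is simply to detect Windows early and raise a clear error instead of letting torch.distributed fail cryptically:

```python
import platform


def validate_ddppo_config(config, system=None):
    """Hypothetical sketch of a DD-PPO config validation step.

    DD-PPO relies on torch.distributed.init_process_group, which is
    not supported on Windows, so fail fast with a clear message.
    The `system` parameter exists only to make the check testable.
    """
    system = system or platform.system()
    if system == "Windows":
        raise NotImplementedError(
            "DD-PPO is not supported on Windows: it depends on "
            "torch.distributed.init_process_group, which is "
            "unavailable on this platform."
        )
    # ... further config validation would continue here ...
```

The check runs once at trainer construction time, so the cost is negligible and the user gets an actionable message up front.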