TensorFlow MultiWorkerMirroredStrategy in RLlib

How severe does this issue affect your experience of using Ray?

  • None: Just asking a question out of curiosity

I looked at the DDPPO code and noted that it only runs with the PyTorch framework because it relies on the torch.distributed module. I also saw this issue on GitHub.

I would also like to have a TensorFlow version of DDPPO and would like to try to implement it. In addition, I want to parallelize my exploration algorithm, which would have to work with the same mechanism.

As far as I understand it, the RolloutWorkers first need to be added to a TensorFlow cluster (this requires a TF_CONFIG environment variable holding the cluster config with the worker addresses - I have already coded that), and then, at the point where the models get created, the MultiWorkerMirroredStrategy needs to be used as a scope (this could be done by writing a DDPPOTFPolicy). A minimal sketch of what I mean is below.
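To make the idea concrete, here is a rough sketch of the mechanism I have in mind. The worker addresses and index are placeholders; in the real setup they would be gathered from the RolloutWorkers, and the model/optimizer creation would happen inside the (hypothetical) DDPPOTFPolicy:

```python
import json
import os

import tensorflow as tf

# Placeholder worker addresses -- in the real setup these would be the
# RolloutWorker hosts/ports collected when the cluster is assembled.
worker_addresses = ["10.0.0.1:12345", "10.0.0.2:12345"]
worker_index = 0  # each RolloutWorker would use its own index

# TF_CONFIG must be set on every worker *before* the strategy is created.
os.environ["TF_CONFIG"] = json.dumps({
    "cluster": {"worker": worker_addresses},
    "task": {"type": "worker", "index": worker_index},
})

strategy = tf.distribute.MultiWorkerMirroredStrategy()

# Model and optimizer creation has to happen under the strategy scope,
# which is where a DDPPOTFPolicy would build its model.
with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(2),
    ])
    optimizer = tf.keras.optimizers.Adam(1e-3)
```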

Could someone tell me whether it might now be possible to use TensorFlow's MultiWorkerMirroredStrategy with DDPPO, or elaborate a little on why it was (or still is) not possible?

Thanks!

This question was discussed during the May 24th RLlib office hours (RLlib Office Hours - YouTube). The suggestion was to test the custom DDPPO algorithm using PyTorch first.