PyTorch DistributedTrainable Tune Report on Rank 0 Only

We are running deep reinforcement learning, so our training data comes from simulations running in child processes spawned by each rank. We also have custom distributed communication points for saving learning curves and observations from the simulations. My goal is to wrap Ray Tune around this setup with as few changes as possible.
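Roughly, the reporting pattern I have in mind is sketched below. This is only an illustration of the intent: `train_func`, the iteration loop, and the metric names are placeholders for our existing RL step, and `torch.distributed` is assumed to already be initialized by the distributed trainable wrapper before the function runs.

```python
import torch.distributed as dist
from ray import tune


def train_func(config, checkpoint_dir=None):
    # Assumes the distributed trainable wrapper has already set up the
    # torch.distributed process group for this worker.
    rank = dist.get_rank()

    for it in range(config.get("num_iterations", 100)):
        # Placeholder for our existing step: roll out simulations in the
        # child processes, gather results, and update the policy.
        metrics = {"mean_reward": 0.0, "iteration": it}  # hypothetical metrics

        # Gate reporting so that only rank 0 sends results to Tune and the
        # learning curve is not duplicated once per worker.
        if rank == 0:
            tune.report(**metrics)
```

In the Ray 1.x API I would then wrap `train_func` with `DistributedTrainableCreator` and pass the result to `tune.run`; whether gating `tune.report` on rank 0 like this is actually safe with that wrapper is part of what I am asking.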

Everything is working except the checkpointing; see Checkpoint Discussion below.