Running RLlib with Ray Tune on GCP

Alright, over the last few weeks I figured out how to set up a DQN experiment on GCP:

RLlib DQN example
Starting from example-full.yaml, I had to make the following changes:

  1. See issue #3858 in #ray-clusters, where I also posted the solution.
  2. I have no news yet on the .yaml file for the Trainer configuration … it has lower priority for me.
  3. So the full example is not complete yet, because item 2 is still missing, but the example in issue #3858 should run for anyone who wants to try it out.

At this point, many thanks to @asawari for this awesome work and for providing so many examples: setting up the cluster and running the scripts works amazingly smoothly!

Custom example
The custom example runs similarly, and in the way I expected above:

  1. The code is sent to the head node with ray rsync-up, as shown above, which uploads all necessary files to the cluster.
  2. To run main.py I used ray exec, as shown above, and the code ran without errors.
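
For anyone who wants to script these two steps, here is a minimal sketch in Python that only builds and prints the ray rsync-up and ray exec command lines. The directory names (./my_project/ and ~/my_project/) and the helper functions are my own placeholders, not something from the issue, and I assume the cluster was already started from example-full.yaml. Set dry_run=False only once the printed commands look right for your setup.

```python
import shlex
import subprocess
from typing import List

# Cluster config file named in the post; adjust to your own path.
CLUSTER_CONFIG = "example-full.yaml"
# Hypothetical locations, chosen only for illustration.
LOCAL_DIR = "./my_project/"
REMOTE_DIR = "~/my_project/"


def rsync_up_cmd(config: str, source: str, target: str) -> List[str]:
    """Build `ray rsync-up CLUSTER_CONFIG SOURCE TARGET` as an argument list."""
    return ["ray", "rsync-up", config, source, target]


def exec_cmd(config: str, remote_command: str) -> List[str]:
    """Build `ray exec CLUSTER_CONFIG CMD`; CMD is passed as one argument."""
    return ["ray", "exec", config, remote_command]


def run(cmd: List[str], dry_run: bool = True) -> None:
    """Print the command; execute it only when dry_run is False."""
    print(shlex.join(cmd))
    if not dry_run:
        # check=True raises CalledProcessError if the ray CLI exits non-zero.
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Step 1: upload the project files to the head node.
    run(rsync_up_cmd(CLUSTER_CONFIG, LOCAL_DIR, REMOTE_DIR))
    # Step 2: run main.py on the head node.
    run(exec_cmd(CLUSTER_CONFIG, f"python {REMOTE_DIR}main.py"))
```

Keeping the commands as argument lists avoids quoting problems on the local side; the remote command string is still interpreted by the shell on the head node.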

Hope this helps others who are at the same point in their projects.