Happy 2021 everyone
Today I set out to speed test Ray with a cluster setup. It didn’t go so well; maybe I can submit a use-case example like this one to the docs once I have it working?
My setup is as follows:
- Head Node: VPC with 4 cores, no firewall, Ray 1.1.0, Python 3.6.9
- Worker Node: Google Colab with 2 cores, firewall, Ray 1.1.0, Python 3.6.9
Steps taken:
- on head node - ray start --head
- on worker node - ray start --address='8.8.8.8:6379' --redis-password='0000000000000000' # note address has been obfuscated
- on head node - python custom_tf_policy.py
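Before launching the training script, it can help to confirm that the worker actually registered with the head. The sketch below is my own (the helper name and sample dict are hypothetical); the real resources dict would come from `ray.cluster_resources()` after `ray.init(address='auto')` on the head node:

```python
# Hypothetical helper: check that the cluster exposes more CPUs than the
# head node alone, i.e. at least one worker has registered. The dict has
# the same shape as the one returned by ray.cluster_resources().
def worker_joined(resources, head_cpus, worker_cpus):
    """Return True if total CPUs cover the head plus one worker."""
    return resources.get("CPU", 0) >= head_cpus + worker_cpus

# Example: a 4-core head plus a 2-core Colab worker should report 6 CPUs.
sample = {"CPU": 6.0, "memory": 33.84, "object_store_memory": 10.16}
print(worker_joined(sample, head_cpus=4, worker_cpus=2))  # True
```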
custom_tf_policy.py is located at https://github.com/ray-project/ray/blob/master/rllib/examples/custom_tf_policy.py#L48
I changed line 48 to ray.init(address='auto', _redis_password='0000000000000000')
I changed line 56 to "num_workers": 5,
Observations:
The head node starts perfectly, as does connecting the worker to it. The output looks normal when executing "custom_tf_policy.py", but it never makes it to training:
2021-01-01 22:08:29,908 INFO worker.py:657 -- Connecting to existing Ray cluster at address: 8.8.8.8:6379
== Status ==
Memory usage on this node: 1.6/7.8 GiB
Using FIFO scheduling algorithm.
Resources requested: 6/14 CPUs, 0/4 GPUs, 0.0/33.84 GiB heap, 0.0/10.16 GiB objects
Result logdir: /home/ray/ray_results/MyCustomTrainer_2021-01-01_22-08-29
Number of trials: 1/1 (1 RUNNING)
+-----------------------------------------+----------+-------+
| Trial name                              | status   | loc   |
|-----------------------------------------+----------+-------|
| MyCustomTrainer_CartPole-v0_e153c_00000 | RUNNING  |       |
+-----------------------------------------+----------+-------+
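The "Resources requested: 6/14 CPUs" line above is consistent with the config: by default each RLlib rollout worker claims one CPU, plus one for the trainer process itself. A rough sanity check (the helper is my own, assuming the default of one CPU per worker):

```python
def requested_cpus(num_workers, cpus_per_worker=1, driver_cpus=1):
    """CPUs a trainer requests: one per rollout worker (the RLlib
    default) plus one for the trainer/driver process itself."""
    return num_workers * cpus_per_worker + driver_cpus

# With "num_workers": 5 this gives 6, matching "6/14 CPUs" in the status.
print(requested_cpus(5))  # 6
```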
There are a few INFO and WARNING messages, but nothing related to workers, the network, or anything unusual. Pressing Ctrl-C yields the following traceback:
File "/home/ray/.local/lib/python3.6/site-packages/ray/tune/tune.py", line 419, in run
runner.step()
File "/home/ray/.local/lib/python3.6/site-packages/ray/tune/trial_runner.py", line 360, in step
self._process_events() # blocking
File "/home/ray/.local/lib/python3.6/site-packages/ray/tune/trial_runner.py", line 469, in _process_events
trial = self.trial_executor.get_next_available_trial() # blocking
File "/home/ray/.local/lib/python3.6/site-packages/ray/tune/ray_trial_executor.py", line 472, in get_next_available_trial
[result_id], _ = ray.wait(shuffled_results)
File "/home/ray/.local/lib/python3.6/site-packages/ray/worker.py", line 1513, in wait
worker.current_task_id,
Here’s the full script to save you from opening the page:
import argparse
import os

import ray
from ray import tune
from ray.rllib.agents.trainer_template import build_trainer
from ray.rllib.evaluation.postprocessing import discount_cumsum
from ray.rllib.policy.tf_policy_template import build_tf_policy
from ray.rllib.utils.framework import try_import_tf

tf1, tf, tfv = try_import_tf()

parser = argparse.ArgumentParser()
parser.add_argument("--stop-iters", type=int, default=200)
parser.add_argument("--num-cpus", type=int, default=0)


def policy_gradient_loss(policy, model, dist_class, train_batch):
    logits, _ = model.from_batch(train_batch)
    action_dist = dist_class(logits, model)
    return -tf.reduce_mean(
        action_dist.logp(train_batch["actions"]) * train_batch["returns"])


def calculate_advantages(policy,
                         sample_batch,
                         other_agent_batches=None,
                         episode=None):
    sample_batch["returns"] = discount_cumsum(sample_batch["rewards"], 0.99)
    return sample_batch


MyTFPolicy = build_tf_policy(
    name="MyTFPolicy",
    loss_fn=policy_gradient_loss,
    postprocess_fn=calculate_advantages,
)

MyTrainer = build_trainer(
    name="MyCustomTrainer",
    default_policy=MyTFPolicy,
)

if __name__ == "__main__":
    args = parser.parse_args()
    ray.init(address='auto', _redis_password='0000000000000000')
    tune.run(
        MyTrainer,
        stop={"training_iteration": args.stop_iters},
        config={
            "env": "CartPole-v0",
            # Use GPUs iff RLLIB_NUM_GPUS env var set to > 0.
            "num_gpus": int(os.environ.get("RLLIB_NUM_GPUS", "0")),
            "num_workers": 5,
            "framework": "tf",
        })