I'm trying to train an agent on a custom environment, but training is very slow, even with the default CartPole example and with or without a GPU.
My setup is:
Windows 10
Ray 1.3.0
TensorFlow 2.4
Python 3.7.10
When I run:
from ray import tune
from ray.rllib.agents.ppo import PPOTrainer

myconfig = {"env": "CartPole-v0"}
myconfig["num_workers"] = 12
myconfig["num_gpus"] = 0
myconfig["log_level"] = "WARN"
# myconfig["framework"] = "torch"  # also tried tf, tf2, and eager (tfe)
tune.run(PPOTrainer, config=myconfig)
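To rule out Tune overhead, I'm also planning to time the training loop directly, along these lines (just a sketch: it reuses myconfig from above and assumes the result dict returned by train() carries the same timers keys as the printout below):

import time
import ray
from ray.rllib.agents.ppo import PPOTrainer

ray.init()
trainer = PPOTrainer(config=myconfig)  # same config as the tune.run call
for i in range(3):
    t0 = time.time()
    result = trainer.train()  # one full sample + SGD iteration
    print(i, "iter_seconds:", round(time.time() - t0, 1),
          "learn_time_ms:", result["timers"]["learn_time_ms"])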
With the tune.run version, the timers reported in the results are:
learn_throughput: 580.592
learn_time_ms: 8267.423
sample_throughput: 773.311
sample_time_ms: 6207.078
update_time_ms: 6.689
and the time per iteration is about 15 seconds. I'm mostly concerned about learn_time_ms since, as I understand it, that is the SGD update step, while sample_time_ms covers trajectory collection (it seems a little slow too?). The numbers are at least self-consistent: both 580.6 samples/s × 8.27 s and 773.3 samples/s × 6.21 s work out to roughly 4800 samples per iteration, which I assume is PPO's default train_batch_size of 4000 rounded up to a multiple of num_workers × rollout_fragment_length (12 × 200 = 2400).
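In case it helps narrow things down, here's the follow-up sweep I'm planning, a sketch that assumes PPO's standard train_batch_size / sgd_minibatch_size / num_sgd_iter config keys behave as documented: if the SGD loop itself is the bottleneck, learn_time_ms should drop roughly in proportion to num_sgd_iter.

from ray import tune
from ray.rllib.agents.ppo import PPOTrainer

# Sweep the SGD knobs to see how learn_time_ms scales.
# PPO defaults in Ray 1.3.0 (as I understand them): train_batch_size=4000,
# sgd_minibatch_size=128, num_sgd_iter=30.
sweep_config = {
    "env": "CartPole-v0",
    "num_workers": 12,
    "num_gpus": 0,
    "log_level": "WARN",
    "num_sgd_iter": tune.grid_search([5, 15, 30]),        # fewer SGD passes per batch
    "sgd_minibatch_size": tune.grid_search([128, 512]),   # larger minibatches, fewer steps
}
tune.run(PPOTrainer, config=sweep_config, stop={"training_iteration": 5})

If learn_time_ms stays flat across these settings, I'd take that as a sign the overhead is somewhere other than the SGD loop itself.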