Inconsistent number of episodes during evaluation

Hi there,

I’ve been playing around with the trainers’ evaluation routine, and I’m having difficulty getting the rollout workers to run a consistent number of episodes per evaluation.

For example, if I set evaluation_num_episodes to 300 with 3 workers, the evaluate routine calls worker.sample.remote() 300 times.
With the default config I get a highly variable number of episodes per evaluation (len is the mean episode length):

episodes: 445, len: 11.3                                                                                                                                                                      
episodes: 582, len: 7.5                                                                                                                                                                       
episodes: 442, len: 11.2                                                                                                                                                                      
episodes: 450, len: 10.5                                                                                                                                                                      
episodes: 429, len: 11.1                                                                                                                                                                      
episodes: 442, len: 10.8                                                                                                                                                                      
episodes: 551, len: 8.3                                                                                                                                                                       
episodes: 409, len: 11.5                                                                                                                                                                      
episodes: 479, len: 10.0                                                                                                                                                                      
episodes: 404, len: 12.3                                                                                                                                                                      
episodes: 450, len: 10.8                                                                                                                                                                      
episodes: 383, len: 13.0                                                                                                                                                                      
episodes: 401, len: 12.4                                                                                                                                                                      
episodes: 374, len: 13.8                                                                                                                                                                      
episodes: 411, len: 11.7                                                                                                                                                                      
episodes: 370, len: 13.5                                                                                                                                                                      
episodes: 381, len: 13.4                                                                                                                                                                      
episodes: 350, len: 14.7                                                                                                                                                                      
episodes: 324, len: 16.3                                                                                                                                                                      
episodes: 339, len: 15.8  

With rollout_fragment_length set to 1 I get better results:

episodes: 310, len: 11.2                                                                                                                                                                      
episodes: 294, len: 7.3                                                                                                                                                                       
episodes: 302, len: 11.7                                                                                                                                                                      
episodes: 300, len: 11.0                                                                                                                                                                      
episodes: 300, len: 11.2                                                                                                                                                                      
episodes: 295, len: 10.9                                                                                                                                                                      
episodes: 305, len: 8.5                                                                                                                                                                       
episodes: 299, len: 12.3                                                                                                                                                                      
episodes: 301, len: 10.5                                                                                                                                                                      
episodes: 302, len: 12.6                                                                                                                                                                      
episodes: 295, len: 11.1                                                                                                                                                                      
episodes: 302, len: 13.9                                                                                                                                                                      
episodes: 300, len: 12.4                                                                                                                                                                      
episodes: 298, len: 12.4                                                                                                                                                                      
episodes: 303, len: 12.2                                                                                                                                                                      
episodes: 299, len: 13.8                                                                                                                                                                      
episodes: 300, len: 14.1                                                                                                                                                                      
episodes: 300, len: 14.4                                                                                                                                                                      
episodes: 296, len: 16.3                                                                                                                                                                      
episodes: 304, len: 15.4    
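
For reference, the setup behind the second set of numbers could be written roughly like this. It is only a sketch of the old dict-style trainer config: the 300 episodes, 3 workers and rollout_fragment_length of 1 come from the description above, while putting the override inside evaluation_config (rather than at the top level), the evaluation_interval value, and the explicit batch_mode are assumptions.

```python
# Sketch of a dict-style RLlib trainer config for evaluation.
# Only the values marked "from the post" are taken from the thread;
# everything else is an assumption made for illustration.
config = {
    "evaluation_interval": 1,        # assumed: evaluate after every training iteration
    "evaluation_num_episodes": 300,  # from the post
    "evaluation_num_workers": 3,     # from the post ("3 workers"), assumed to be evaluation workers
    "evaluation_config": {
        # Overrides applied only to the evaluation workers (assumed placement).
        "rollout_fragment_length": 1,       # from the post
        "batch_mode": "complete_episodes",  # assumed
    },
}
```

Keeping the override inside evaluation_config would leave the sampling settings used for training untouched.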

However, it’s still pretty inconsistent, which is frustrating, and sometimes it’s now less than the specified number of episodes, which I think is worse. Changing the batch mode didn’t seem to have any effect on the numbers.
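
A rough way to see why the fragment length could matter here, as a sketch only. It assumes that, when complete episodes are collected, each sample() call keeps packing whole episodes into its batch until the batch holds at least rollout_fragment_length steps. The function name, the fragment length of 10, and the exponential episode-length model are all made up for illustration; none of this is RLlib code.

```python
import random


def episodes_per_evaluation(num_sample_calls, fragment_length, mean_episode_len, seed=0):
    """Count episodes returned over one evaluation.

    Assumed behaviour: every sample() call packs whole episodes into its
    batch until the batch contains at least `fragment_length` steps.
    """
    rng = random.Random(seed)
    total_episodes = 0
    for _ in range(num_sample_calls):
        steps_in_batch = 0
        while steps_in_batch < fragment_length:
            # Hypothetical episode-length model: exponential, at least 1 step.
            episode_len = max(1, round(rng.expovariate(1.0 / mean_episode_len)))
            steps_in_batch += episode_len
            total_episodes += 1
    return total_episodes


# 300 sample() calls, as in the post; a fragment length of 10 is hypothetical.
for mean_len in (8.0, 11.0, 16.0):
    packed = episodes_per_evaluation(300, 10, mean_len)
    single = episodes_per_evaluation(300, 1, mean_len)
    print(f"mean len {mean_len}: fragment=10 gives {packed} episodes, fragment=1 gives {single}")
```

Under that assumption, a fragment length of 1 yields exactly one episode per call, while a larger fragment length yields more episodes the shorter the episodes are, which is the direction of the trend in the first set of numbers. It does not explain the remaining scatter around 300 in the second set.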

Is there any better way to sample evaluations, or have I just missed a configuration option?

Cheers,

Rory

You want to use a low-throughput algorithm. IMPALA/APE-X will run the trainer in its own loop, so after finishing a batch, it will start training on whatever is in the queue. If the queue is not full, it will train on fewer episodes.
