Inconsistent number of episodes with 'evaluate'

How severe does this issue affect your experience of using Ray?

  • Low: It annoys or frustrates me for a moment.

Hi!

I noticed strange behavior with rllib evaluate on the command line: depending on the environment, the number of evaluation episodes requested on the command line is never reached during the evaluation.

For example, running rllib evaluate <chkpt_path> --run PPO --env CartPole-v0 --episodes 100 only runs about 50-55 evaluation episodes (no matter how many episodes are requested, as long as it is more than 55). I have the same problem with other environments and algorithms, each time with a different “max” number of episodes.

Here is a reproduction script:

import ray
from ray import tune
import gym
import os

ray.tune.run(
    'PPO',
    stop={
        'training_iteration': 50,
    },
    config={'env':'CartPole-v0'},
    checkpoint_freq=25,
    checkpoint_score_attr='episode_reward_mean',
    checkpoint_at_end=True,
    local_dir='checkpoint',
)

# Find the most recently modified trial directory under checkpoint/PPO.
os.chdir('checkpoint/PPO')
trial_dirs = [name for name in os.listdir('.') if os.path.isdir(name)]
chkpt_dir = max(trial_dirs, key=os.path.getmtime)

# Evaluate the final checkpoint, requesting 200 episodes.
os.system('rllib evaluate ' + chkpt_dir + '/checkpoint_000050/checkpoint-50 --run PPO --env CartPole-v0 --episodes 200')

Hi @leo593,
I can replicate this. Looking into the `evaluate.py` script, it might make sense:

parser.add_argument(
    "--steps",
    default=10000,
    help="Number of timesteps to roll out. Rollout will also stop if "
    "`--episodes` limit is reached first. A value of 0 means no "
    "limitation on the number of timesteps run.",
)
parser.add_argument(
    "--episodes",
    default=0,
    help="Number of complete episodes to roll out. Rollout will also stop "
    "if `--steps` (timesteps) limit is reached first. A value of 0 means "
    "no limitation on the number of episodes run.",
)

Could it be that the number of timesteps has reached 10,000 after 50 episodes?
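
To make the arithmetic concrete, here is a rough sketch of how the two limits could interact. It is not the actual `evaluate.py` logic: the function `episodes_completed` is a hypothetical helper, it assumes every episode has the same length, and it only checks the limits at episode boundaries. CartPole-v0 caps an episode at 200 timesteps, so a well-trained policy that always survives to the cap would use up the default 10,000 timesteps in 50 episodes.

```python
def episodes_completed(episode_len, max_steps=10000, max_episodes=0):
    """Count episodes run before either limit is hit (0 means no limit).

    Hypothetical helper for illustration only; assumes a fixed episode
    length and checks the limits between episodes.
    """
    if not max_steps and not max_episodes:
        raise ValueError("at least one limit must be non-zero")
    steps = 0
    episodes = 0
    while True:
        # Stop once the requested number of episodes has been run.
        if max_episodes and episodes >= max_episodes:
            break
        # Stop once the timestep budget is used up.
        if max_steps and steps >= max_steps:
            break
        steps += episode_len
        episodes += 1
    return episodes


# 200-step episodes with the default 10,000-step cap, 100 episodes requested.
print(episodes_completed(200, max_steps=10000, max_episodes=100))  # 50
# Same request with the timestep cap disabled.
print(episodes_completed(200, max_steps=0, max_episodes=100))  # 100
```

Going by the help text above, passing `--steps 0` together with `--episodes 200` should remove the timestep cap, so that only the episode limit applies.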


Hi @Lars_Simon_Zehnder, thank you for your help, you’re right! It makes sense now :slight_smile:
