How severely does this issue affect your experience of using Ray?
- Medium: It contributes to significant difficulty to complete my task, but I can work around it.
Hello everyone,
I’m using a custom curriculum env as in
My curriculum_fn is something like this:
```python
def curriculum_fn(
    train_results: dict, task_settable_env: "TaskSettableEnv", env_ctx: "EnvContext"
) -> "TaskType":
    # If all episodes in the evaluation ended before max_episode_len,
    # move to the next task.
    new_task = task_settable_env.get_task()
    max_level = task_settable_env.max_level
    try:
        if all(
            ep_len < env_ctx["max_episode_len"]
            for ep_len in train_results["evaluation"]["hist_stats"]["episode_lengths"]
        ):
            if (new_task + 1) < max_level:
                new_task += 1
                print(
                    f"Worker #{env_ctx.worker_index} vec-idx={env_ctx.vector_index}"
                    f"\nR={train_results['episode_reward_mean']}"
                    f"\nSetting env to task={new_task}"
                )
    except KeyError:
        # train_results has no "evaluation" key on this iteration.
        print("not in evaluation phase")
    return new_task
```
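To show the behavior I expect from this function, here is a self-contained sketch: `StubTaskEnv` is a minimal stand-in I wrote for RLlib's `TaskSettableEnv` (not the real class), and the `print` calls are dropped so the logic is easy to follow. The function leaves the task unchanged when there are no evaluation results and advances it when all evaluation episodes ended early:

```python
# Minimal stand-in for RLlib's TaskSettableEnv, for illustration only.
class StubTaskEnv:
    max_level = 3

    def __init__(self):
        self._task = 0

    def get_task(self):
        return self._task


def curriculum_fn(train_results, task_settable_env, env_ctx):
    new_task = task_settable_env.get_task()
    max_level = task_settable_env.max_level
    try:
        if all(
            ep_len < env_ctx["max_episode_len"]
            for ep_len in train_results["evaluation"]["hist_stats"]["episode_lengths"]
        ):
            if (new_task + 1) < max_level:
                new_task += 1
    except KeyError:
        pass  # no evaluation results on this iteration
    return new_task


env, ctx = StubTaskEnv(), {"max_episode_len": 100}
# No "evaluation" key yet -> task stays at 0
print(curriculum_fn({}, env, ctx))  # → 0
# All evaluation episodes shorter than max_episode_len -> advance to 1
results = {"evaluation": {"hist_stats": {"episode_lengths": [40, 55]}}}
print(curriculum_fn(results, env, ctx))  # → 1
```

Note that the returned task is only applied by the framework that calls `curriculum_fn`; the function itself never mutates the env.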
What I want is to increase the task level only if, during the evaluation phase, my agent was able to win before reaching the max length of the episode.
During the training phase, the environment works differently: it stops automatically once certain conditions are met (conditions that did not occur in the evaluation phase).
The problem is that only the training-phase environments change level, while the evaluation environments stay stuck at the first task. I have two questions in this regard:
- Are the workers' environments re-initialized after each training iteration, or are they just reset? If it's the latter, it's strange that my evaluation_workers are always stuck at level 1.
- Is there a way to set the task level as a "global variable" that is changed by the evaluation_workers but can also be retrieved by the "training workers" when the right conditions are met?
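For what it's worth, one approach I have seen (e.g. in RLlib's curriculum-learning example) is to drive the task from a trainer callback instead of `curriculum_fn`, pushing the same task to both the rollout and the evaluation workers. Below is a minimal, self-contained sketch of that idea; `StubEnv`, `StubWorkerSet`, and this `on_train_result` function are stand-ins I made up to show the pattern. In real RLlib you would override `DefaultCallbacks.on_train_result` and use `algorithm.workers` / `algorithm.evaluation_workers` with `foreach_worker(...)` and `foreach_env(...)`:

```python
class StubEnv:
    """Stand-in for a TaskSettableEnv instance on a worker."""

    def __init__(self):
        self.task = 0

    def set_task(self, task):
        self.task = task


class StubWorkerSet:
    """Stand-in mimicking WorkerSet -> RolloutWorker.foreach_env."""

    def __init__(self, n_envs):
        self.envs = [StubEnv() for _ in range(n_envs)]

    def foreach_env(self, fn):
        for env in self.envs:
            fn(env)


def on_train_result(train_results, workers, evaluation_workers,
                    max_episode_len, max_level):
    # Only advance when evaluation results are present AND every
    # evaluation episode ended before max_episode_len.
    hist = train_results.get("evaluation", {}).get("hist_stats", {})
    lengths = hist.get("episode_lengths")
    if lengths and all(l < max_episode_len for l in lengths):
        new_task = min(workers.envs[0].task + 1, max_level - 1)
        # Push the same task to BOTH worker sets so they never diverge.
        for ws in (workers, evaluation_workers):
            ws.foreach_env(lambda env: env.set_task(new_task))


workers, eval_workers = StubWorkerSet(2), StubWorkerSet(1)
results = {"evaluation": {"hist_stats": {"episode_lengths": [40, 55]}}}
on_train_result(results, workers, eval_workers,
                max_episode_len=100, max_level=3)
print([env.task for env in workers.envs + eval_workers.envs])  # → [1, 1, 1]
```

Because the callback, not each env, decides the level, the task acts like the "global variable" asked about: training and evaluation workers always receive the same value in the same iteration.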