When running PPO, it cannot calculate the episode reward

foc689 · August 18, 2023, 2:56am

Hi there. When I use Ray to run my custom env with PPOConfig, it shows the warning below:

(RolloutWorker pid=1954982) 2023-08-17 11:19:07,224     WARNING env.py:162 -- Your env doesn't have a .spec.max_episode_steps attribute. Your horizon will default to infinity, and your environment will not be reset.

This leads to episode_reward_mean = nan.
I have also seen similar questions, but I don't know how to set .spec.max_episode_steps, and the new version 2.6.3 has no horizon attribute, so I don't know how to deal with this problem. I would appreciate any help.
Thanks
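
For context, episode_reward_mean is averaged over completed episodes, so it likely stays nan as long as the env never signals that an episode has ended. One possible workaround, shown here only as a minimal sketch, is to count steps inside the env and return truncated=True once a limit is hit. The class name StepLimitedEnv, the max_episode_steps constructor argument, and the run_episode helper are hypothetical names used for illustration; a real custom env would subclass gymnasium.Env and define its observation and action spaces, which this dependency-free sketch leaves out.

```python
import random


class StepLimitedEnv:
    """Toy env following the gymnasium-style reset/step signatures.

    Hypothetical stand-in for a custom env: a real one would subclass
    gymnasium.Env and declare observation_space and action_space.
    """

    def __init__(self, max_episode_steps=200):
        # Hypothetical parameter: the step limit the env enforces on itself.
        self.max_episode_steps = max_episode_steps
        self._elapsed = 0
        self._rng = random.Random()

    def reset(self, *, seed=None, options=None):
        # Start a new episode and clear the step counter.
        self._rng = random.Random(seed)
        self._elapsed = 0
        return 0.0, {}

    def step(self, action):
        self._elapsed += 1
        obs = self._rng.random()
        reward = 1.0
        terminated = False  # this toy task has no natural end state
        # Signal the end of the episode once the step limit is reached.
        truncated = self._elapsed >= self.max_episode_steps
        return obs, reward, terminated, truncated, {}


def run_episode(env):
    """Roll out one episode with a fixed action and return the total reward."""
    env.reset(seed=0)
    total, done = 0.0, False
    while not done:
        _, reward, terminated, truncated, _ = env.step(0)
        total += reward
        done = terminated or truncated
    return total
```

With the limit in place, every rollout ends after at most max_episode_steps steps, so there are finished episodes to average over. An alternative that avoids touching the env code is gymnasium's TimeLimit wrapper, which adds the same kind of truncation around an existing env.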
