Worker Timeout and restart

SebastianBo1995 · February 15, 2022, 9:23am

Hello,

My RL training currently freezes after 1-3 days of running. I am not sure what causes the freeze, so I would like to implement a timeout for the environment. I have a few questions:

Is there already a timeout implemented in RLlib / Ray that can be used?
Is it possible to use that timeout to completely close and restart the worker?
If not, is Ray compatible with signals, or with the timeout-decorator package from PyPI?

Thanks in advance

Regards
Sebastian
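
For reference, here is a minimal sketch of the kind of signal-based environment timeout asked about above. It is not an RLlib or Ray API: `StepTimeout`, `time_limit`, `TimeLimitedEnv` and `FlakyEnv` are hypothetical names made up for illustration, and the sketch only rebuilds the environment object; it does not restart a Ray worker. It assumes a Unix system and that `step()` is called from the main thread of its process, since Python only allows signal handlers to be installed there. Whether that holds inside a rollout worker is an assumption that would need checking.

```python
import signal
import time
from contextlib import contextmanager


class StepTimeout(Exception):
    """Raised when the guarded call exceeds its time budget."""


@contextmanager
def time_limit(seconds):
    """Abort the enclosed block with StepTimeout after `seconds`.

    Relies on SIGALRM, so it works only on Unix and only in the main thread.
    """

    def _on_alarm(signum, frame):
        raise StepTimeout(f"call exceeded {seconds} s")

    previous = signal.signal(signal.SIGALRM, _on_alarm)
    signal.setitimer(signal.ITIMER_REAL, seconds)
    try:
        yield
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)  # cancel any pending alarm
        signal.signal(signal.SIGALRM, previous)  # restore the old handler


class TimeLimitedEnv:
    """Hypothetical wrapper that rebuilds the environment if step() hangs."""

    def __init__(self, make_env, step_timeout_s):
        self._make_env = make_env  # zero-argument factory for a fresh env
        self._timeout = step_timeout_s
        self.env = make_env()
        self.restarts = 0

    def reset(self):
        return self.env.reset()

    def step(self, action):
        try:
            with time_limit(self._timeout):
                return self.env.step(action)
        except StepTimeout:
            # Drop the hung environment and build a new one, then let the
            # caller decide how to continue (for example, reset the episode).
            self.env = self._make_env()
            self.restarts += 1
            raise


class FlakyEnv:
    """Toy stand-in environment: action 1 simulates a freeze."""

    def reset(self):
        return 0

    def step(self, action):
        if action == 1:
            time.sleep(5)  # simulated hang
        return 0, 0.0, False, {}
```

With `env = TimeLimitedEnv(FlakyEnv, step_timeout_s=0.2)`, calling `env.step(0)` returns normally, while `env.step(1)` raises `StepTimeout` after roughly 0.2 s and leaves a freshly built environment in `env.env`. One limitation to keep in mind: Python runs signal handlers only between bytecode instructions, so a hang inside a C extension call that never returns to the interpreter would not be interrupted this way. In that case the environment would have to live in a separate process that can be killed.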
