How severely does this issue affect your experience of using Ray?
- None: Just asking a question out of curiosity
Hi all, while perusing the config file ray/src/ray/common/ray_config_def.h in the ray-2.3.0 tag of ray-project/ray on GitHub,
I stumbled upon the environment flag RAY_task_oom_retries,
which is set to -1 by default. I am confused about the implications of this flag for actor restarts.
- The description seems to imply that when an actor is killed by the memory monitor due to OOM (so the actor also dies), this does not count as a restart against max_restarts. Is that correct? If so, I don't think it is documented on the actor fault tolerance page or the actors page of the Ray docs; I was under the impression that any actor restart decrements the max_restarts counter.
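
For concreteness, here is a minimal sketch of the setup I am asking about. It assumes Ray picks up RAY_-prefixed environment variables when it is started from the same process; the values 2 and 3, the helper configure_oom_retries, and the Worker actor are made up for illustration. It does not show how OOM kills are counted, which is exactly the open question.

```python
import os

# Flag name taken from ray_config_def.h; how it interacts with max_restarts
# is the open question, so nothing here asserts that behaviour.
OOM_RETRIES_ENV = "RAY_task_oom_retries"


def configure_oom_retries(retries: int) -> None:
    """Export the flag so a Ray instance started afterwards can read it (assumption)."""
    os.environ[OOM_RETRIES_ENV] = str(retries)


def main() -> None:
    # Imported lazily so the helper above can be used without Ray installed.
    import ray

    configure_oom_retries(2)  # arbitrary illustrative value
    ray.init()

    # max_restarts is the actor restart budget described in the Ray docs.
    @ray.remote(max_restarts=3)
    class Worker:  # hypothetical actor, for illustration only
        def ping(self) -> str:
            return "pong"

    worker = Worker.remote()
    print(ray.get(worker.ping.remote()))
    ray.shutdown()
```

Calling main() starts a local Ray instance with the flag set and one restartable actor; if the memory monitor then kills that actor, does the kill consume one of the 3 restarts or not?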