Hi,
I’m trying to submit jobs to an AWS Ray cluster with a t3.medium head node and 6 workers.
The jobs succeed as expected, but after 250 jobs, all further jobs fail with this message:
Unexpected error occurred: The actor died unexpectedly before finishing the task.
However, after I restart the cluster, the next 250 jobs succeed again.
Each of my jobs creates an actor and calls 7 remote tasks, and collects the results with ray.get().
Is there any way to resolve this?