I am using Ray to run two functions in parallel, and I get the error message "Worker unexpectedly exits with a connection error code 2". It says the process was killed by SIGKILL by the OOM killer due to high memory usage.
It also says to check python-core-worker-*.log. Where do I find this log file? I am working in an Anaconda environment.
What version of Ray are you running? Ray 2.0 has ray logs, which shows all the files available, including the
@Clarence_Ng knows more about dealing with OOMs, is there anything else to be done?
Also, it’s expected that Ray will kill a task that is consuming too much heap memory. Otherwise, it will cause the machine to crash or experience severe performance degradation. What is the workload you are running? Do you expect it to consume all heap memory available?
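One rough way to answer that before going back to Ray is to run the function once on its own and record its peak memory. The sketch below uses the standard-library tracemalloc module. The build_table function is a hypothetical stand-in for the real workload, and tracemalloc only sees allocations made through Python's allocator, so memory used by native extensions may not be counted. Treat the number as a lower bound, not as what the OOM killer sees.

```python
import tracemalloc


def peak_python_memory_mib(func, *args, **kwargs):
    """Run func once and return (result, peak MiB of traced Python allocations)."""
    tracemalloc.start()
    try:
        result = func(*args, **kwargs)
        # get_traced_memory() returns (current, peak) in bytes.
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak / (1024 * 1024)


def build_table(rows, cols):
    # Hypothetical stand-in for the real workload (not from the original post).
    return [[0.0] * cols for _ in range(rows)]


if __name__ == "__main__":
    _, peak = peak_python_memory_mib(build_table, 1000, 100)
    print(f"peak traced Python allocations: {peak:.1f} MiB")
```

If the peak for one call is already close to the machine's memory, running two such calls in parallel will likely trigger the OOM killer.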
@mlguy89 thanks for filing the report. Typically a worker crash also comes with a stack trace of the error. Does it contain anything interesting or useful?
By default, the log files live in /tmp/ray/session_latest/logs for the latest cluster. Let us know if the documentation isn’t clear or where you think it needs improvement: Logging — Ray 2.0.1
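If it helps, here is a minimal sketch for listing the worker logs from Python, assuming the default directory mentioned above. The helper names find_worker_logs and tail are made up for this example and are not part of Ray. If your session uses a different temp directory, pass that path instead.

```python
import glob
import os

# Default location quoted above; an assumption if Ray was started with a custom temp dir.
DEFAULT_LOG_DIR = "/tmp/ray/session_latest/logs"


def find_worker_logs(log_dir=DEFAULT_LOG_DIR):
    """Return python-core-worker-*.log paths in log_dir, most recently modified first."""
    pattern = os.path.join(log_dir, "python-core-worker-*.log")
    return sorted(glob.glob(pattern), key=os.path.getmtime, reverse=True)


def tail(path, n=20):
    """Return the last n lines of a text file as a list of strings."""
    with open(path, errors="replace") as f:
        return f.readlines()[-n:]


if __name__ == "__main__":
    for path in find_worker_logs():
        print(path)
        print("".join(tail(path)))
```

The crash details, if any were written, would usually be near the end of the most recently modified file, which is why the list is sorted newest first.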
@mlguy89 did you get to try finding the logs?
After removing one feature, I no longer face any memory issue.