How severely does this issue affect your experience of using Ray?
- Medium: It contributes to significant difficulty in completing my task, but I can work around it.
Hello, I wrote a simple Python script (simple.py) that invokes multiple Ray tasks and calls ray.get()
on them to wait for completion. Inside each Ray task, I use Python's multiprocessing.Pool
for concurrency.
import multiprocessing
import ray
import test

ray.init()

def reduce(i):
    print("I'm reducer", i)

@ray.remote
def foo(i):
    with multiprocessing.Pool(2) as p:
        p.map(test.reduce, [i for i in range(2)])

def rayjob():
    results = ray.get([foo.options(num_cpus=2).remote(i) for i in range(86)])

if __name__ == "__main__":
    rayjob()
Lots of errors are printed:
2023-08-14 11:56:44,485 INFO worker.py:1612 -- Started a local Ray instance. View the dashboard at http://127.0.0.1:8265
(foo pid=100162) I'm reducer 0
(foo pid=100162) I'm reducer 1
(foo pid=100166) I'm reducerI'm reducer 01
(foo pid=100166)
(foo pid=100162) I'm reducer 0I'm reducer
(foo pid=100162) I'm reducer I'm reducer0
(foo pid=100169) Traceback (most recent call last):
(foo pid=100169) File "simple.py", line 12, in foo
(foo pid=100169) with multiprocessing.Pool(2) as p:
(foo pid=100169) File "/home/raytong/.conda/envs/py38/lib/python3.8/multiprocessing/context.py", line 119, in Pool
(foo pid=100169) return Pool(processes, initializer, initargs, maxtasksperchild,
(foo pid=100169) File "/home/raytong/.conda/envs/py38/lib/python3.8/multiprocessing/pool.py", line 212, in __init__
(foo pid=100169) self._repopulate_pool()
(foo pid=100169) File "/home/raytong/.conda/envs/py38/lib/python3.8/multiprocessing/pool.py", line 303, in _repopulate_pool
(foo pid=100169) return self._repopulate_pool_static(self._ctx, self.Process,
(foo pid=100169) File "/home/raytong/.conda/envs/py38/lib/python3.8/multiprocessing/pool.py", line 326, in _repopulate_pool_static
(foo pid=100169) w.start()
(foo pid=100169) File "/home/raytong/.conda/envs/py38/lib/python3.8/multiprocessing/process.py", line 121, in start
(foo pid=100169) self._popen = self._Popen(self)
(foo pid=100169) File "/home/raytong/.conda/envs/py38/lib/python3.8/multiprocessing/context.py", line 277, in _Popen
(foo pid=100169) return Popen(process_obj)
(foo pid=100169) File "/home/raytong/.conda/envs/py38/lib/python3.8/multiprocessing/popen_fork.py", line 19, in __init__
(foo pid=100169) self._launch(process_obj)
(foo pid=100169) File "/home/raytong/.conda/envs/py38/lib/python3.8/multiprocessing/popen_fork.py", line 77, in _launch
(foo pid=100169) os._exit(code)
(foo pid=100169) File "/home/raytong/.conda/envs/py38/lib/python3.8/site-packages/ray/_private/worker.py", line 776, in sigterm_handler
(foo pid=100169) sys.exit(1)
(foo pid=100169) SystemExit: 1
(foo pid=100169)
(foo pid=100169) During handling of the above exception, another exception occurred:
(foo pid=100169)
(foo pid=100169) Traceback (most recent call last):
(foo pid=100169) File "python/ray/_raylet.pyx", line 1418, in ray._raylet.execute_task
(foo pid=100169) File "python/ray/_raylet.pyx", line 1501, in ray._raylet.execute_task
(foo pid=100169) File "/home/raytong/.conda/envs/py38/lib/python3.8/site-packages/ray/_private/worker.py", line 569, in record_task_log_end
(foo pid=100169) self.core_worker.record_task_log_end(
(foo pid=100169) AttributeError: 'Worker' object has no attribute 'core_worker'
(foo pid=100169)
(foo pid=100169) During handling of the above exception, another exception occurred:
(foo pid=100169)
(foo pid=100169) Traceback (most recent call last):
(foo pid=100169) File "python/ray/_raylet.pyx", line 1787, in ray._raylet.task_execution_handler
(foo pid=100169) File "python/ray/_raylet.pyx", line 1684, in ray._raylet.execute_task_with_cancellation_handler
(foo pid=100169) File "python/ray/_raylet.pyx", line 1366, in ray._raylet.execute_task
(foo pid=100169) File "python/ray/_raylet.pyx", line 1367, in ray._raylet.execute_task
(foo pid=100169) File "python/ray/_raylet.pyx", line 1583, in ray._raylet.execute_task
(foo pid=100169) File "python/ray/_raylet.pyx", line 813, in ray._raylet.store_task_errors
(foo pid=100169) AttributeError: 'Worker' object has no attribute 'core_worker'
(foo pid=100169)
(foo pid=100169) During handling of the above exception, another exception occurred:
(foo pid=100169)
(foo pid=100169) Traceback (most recent call last):
(foo pid=100169) File "python/ray/_raylet.pyx", line 1824, in ray._raylet.task_execution_handler
(foo pid=100169) File "/home/raytong/.conda/envs/py38/lib/python3.8/site-packages/ray/_private/utils.py", line 174, in push_error_to_driver
(foo pid=100169) worker.core_worker.push_error(job_id, error_type, message, time.time())
(foo pid=100169) AttributeError: 'Worker' object has no attribute 'core_worker'
(foo pid=100169) Exception ignored in: 'ray._raylet.task_execution_handler'
(foo pid=100169) Traceback (most recent call last):
(foo pid=100169) File "python/ray/_raylet.pyx", line 1824, in ray._raylet.task_execution_handler
(foo pid=100169) File "/home/raytong/.conda/envs/py38/lib/python3.8/site-packages/ray/_private/utils.py", line 174, in push_error_to_driver
(foo pid=100169) worker.core_worker.push_error(job_id, error_type, message, time.time())
(foo pid=100169) [2023-08-14 11:56:45,900 C 100890 100169] direct_actor_transport.cc:201: Check failed: objects_valid
(foo pid=100169) *** StackTrace Information ***
(foo pid=100169) /home/raytong/.conda/envs/py38/lib/python3.8/site-packages/ray/_raylet.so(+0xe63c3a) [0x7f22a6b8bc3a] ray::operator<<()
(foo pid=100169) /home/raytong/.conda/envs/py38/lib/python3.8/site-packages/ray/_raylet.so(+0xe65722) [0x7f22a6b8d722] ray::SpdLogMessage::Flush()
(foo pid=100169) /home/raytong/.conda/envs/py38/lib/python3.8/site-packages/ray/_raylet.so(_ZN3ray6RayLogD1Ev+0x37) [0x7f22a6b8da37] ray::RayLog::~RayLog()
(foo pid=100169) /home/raytong/.conda/envs/py38/lib/python3.8/site-packages/ray/_raylet.so(+0x7384da) [0x7f22a64604da] std::_Function_handler<>::_M_invoke()
(foo pid=100169) /home/raytong/.conda/envs/py38/lib/python3.8/site-packages/ray/_raylet.so(+0x74b34e) [0x7f22a647334e] ray::core::InboundRequest::Accept()
(foo pid=100169) /home/raytong/.conda/envs/py38/lib/python3.8/site-packages/ray/_raylet.so(+0x71ef90) [0x7f22a6446f90] ray::core::NormalSchedulingQueue::ScheduleRequests()
(foo pid=100169) /home/raytong/.conda/envs/py38/lib/python3.8/site-packages/ray/_raylet.so(+0xa016d6) [0x7f22a67296d6] EventTracker::RecordExecution()
(foo pid=100169) /home/raytong/.conda/envs/py38/lib/python3.8/site-packages/ray/_raylet.so(+0x99e5ee) [0x7f22a66c65ee] std::_Function_handler<>::_M_invoke()
(foo pid=100169) /home/raytong/.conda/envs/py38/lib/python3.8/site-packages/ray/_raylet.so(+0x99eb46) [0x7f22a66c6b46] boost::asio::detail::completion_handler<>::do_complete()
(foo pid=100169) /home/raytong/.conda/envs/py38/lib/python3.8/site-packages/ray/_raylet.so(+0xf503db) [0x7f22a6c783db] boost::asio::detail::scheduler::do_run_one()
(foo pid=100169) /home/raytong/.conda/envs/py38/lib/python3.8/site-packages/ray/_raylet.so(+0xf51ea9) [0x7f22a6c79ea9] boost::asio::detail::scheduler::run()
(foo pid=100169) /home/raytong/.conda/envs/py38/lib/python3.8/site-packages/ray/_raylet.so(+0xf52362) [0x7f22a6c7a362] boost::asio::io_context::run()
(foo pid=100169) /home/raytong/.conda/envs/py38/lib/python3.8/site-packages/ray/_raylet.so(_ZN3ray4core10CoreWorker20RunTaskExecutionLoopEv+0x1c) [0x7f22a64023ec] ray::core::CoreWorker::RunTaskExecutionLoop()
(foo pid=100169) /home/raytong/.conda/envs/py38/lib/python3.8/site-packages/ray/_raylet.so(_ZN3ray4core21CoreWorkerProcessImpl26RunWorkerTaskExecutionLoopEv+0x8c) [0x7f22a644233c] ray::core::CoreWorkerProcessImpl::RunWorkerTaskExecutionLoop()
(foo pid=100169) /home/raytong/.conda/envs/py38/lib/python3.8/site-packages/ray/_raylet.so(_ZN3ray4core17CoreWorkerProcess20RunTaskExecutionLoopEv+0x1d) [0x7f22a64424ed] ray::core::CoreWorkerProcess::RunTaskExecutionLoop()
(foo pid=100169) ray::IDLE() [0x4ecaf4] method_vectorcall_NOARGS
(foo pid=100169) ray::IDLE(_PyEval_EvalFrameDefault+0x6b2) [0x4d8762] _PyEval_EvalFrameDefault
(foo pid=100169) ray::IDLE(_PyFunction_Vectorcall+0x106) [0x4e8116] _PyFunction_Vectorcall
(foo pid=100169) ray::IDLE(_PyEval_EvalFrameDefault+0x6b2) [0x4d8762] _PyEval_EvalFrameDefault
(foo pid=100169) ray::IDLE(_PyEval_EvalCodeWithName+0x2f1) [0x4d7071] _PyEval_EvalCodeWithName
(foo pid=100169) ray::IDLE(PyEval_EvalCodeEx+0x39) [0x585b39] PyEval_EvalCodeEx
(foo pid=100169) ray::IDLE(PyEval_EvalCode+0x1b) [0x585afb] PyEval_EvalCode
(foo pid=100169) ray::IDLE() [0x5a58d1] run_eval_code_obj
(foo pid=100169) ray::IDLE() [0x5a48df] run_mod
(foo pid=100169) ray::IDLE() [0x45c502] pyrun_file
(foo pid=100169) ray::IDLE(PyRun_SimpleFileExFlags+0x340) [0x45c0a3] PyRun_SimpleFileExFlags
(foo pid=100169) ray::IDLE() [0x44fe94] Py_RunMain.cold
(foo pid=100169) ray::IDLE(Py_BytesMain+0x39) [0x579b99] Py_BytesMain
(foo pid=100169) /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80) [0x7f22a7cb6e40] __libc_start_main
(foo pid=100169) ray::IDLE() [0x579a4d]
I searched for information about using Ray with multiprocessing, and I know it is not best practice to combine them. However, sometimes the code inside a Ray task is written by others and I can't modify it, and that code calls multiprocessing.
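One idea I am considering, shown here only as an untested sketch: the traceback above fails inside popen_fork.py, with Ray's sigterm_handler running in the forked child. A child started with the "spawn" method begins in a fresh interpreter, so I assume it would not inherit that handler. The helper name force_spawn_start_method is my own (hypothetical, not a Ray API); it only changes the default start method of the standard library, so the third-party code that calls multiprocessing.Pool() would not need to be edited. I have not confirmed that this actually avoids the errors.

```python
import multiprocessing


def force_spawn_start_method():
    """Switch the process-wide default start method to 'spawn'.

    Hypothetical helper, not part of Ray. After this call, a plain
    multiprocessing.Pool() created in the same process starts its workers
    with 'spawn' instead of 'fork'.
    """
    # force=True allows overriding a start method that was already set.
    multiprocessing.set_start_method("spawn", force=True)
    return multiprocessing.get_start_method()


# Intended use inside the task from my script (assumption, untested):
#
# @ray.remote
# def foo(i):
#     force_spawn_start_method()
#     with multiprocessing.Pool(2) as p:
#         p.map(test.reduce, [i for i in range(2)])
```

With "spawn", the function passed to the pool must be importable in the child process, which should hold for test.reduce as long as test.py is on the import path of the Ray worker.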
Thanks for any help