It seems likely that your issue is caused by the Ray worker processes not having the same Python environment (including third-party libraries like numpy) as your main process. When Ray schedules remote tasks or actors, the worker processes must be able to import every library those tasks use. If the Ray cluster is started from a subprocess with a different environment, or if numpy is not installed in that environment, the workers cannot deserialize numpy arrays, and remote calls involving numpy data will time out or raise errors. This is a common pitfall when using Ray with subprocesses or pyinstaller-compiled executables, especially on Windows, because Ray expects a consistent Python environment across all processes and nodes (see related discussion).
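One way to check for this kind of mismatch is to collect the same environment facts in the driver and in a Ray task, then compare them. The following is a minimal sketch, not a definitive diagnostic: `describe_env` and `compare_driver_and_worker` are hypothetical helper names, and the Ray part assumes `ray` is installed and can start a local cluster.

```python
import importlib.util
import sys


def describe_env():
    """Return basic facts about the Python environment of the calling process."""
    return {
        "executable": sys.executable,
        "python_version": sys.version_info[:3],
        # find_spec returns None when the module cannot be located.
        "numpy_importable": importlib.util.find_spec("numpy") is not None,
    }


def compare_driver_and_worker():
    """Run describe_env in the driver and in a Ray worker, and return both results."""
    import ray  # imported lazily so describe_env works without Ray installed

    ray.init(ignore_reinit_error=True)
    driver_env = describe_env()
    # Wrap the plain function as a remote task and run it on a worker.
    worker_env = ray.get(ray.remote(describe_env).remote())
    return driver_env, worker_env
```

If the two dictionaries returned by `compare_driver_and_worker()` differ in `executable` or `numpy_importable`, the workers are probably running in a different environment from the driver.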
My understanding is that to resolve this, you should ensure that the subprocess that starts Ray runs with the same interpreter and environment as your main process, and that all required libraries (like numpy) are installed and importable in that environment. Importing numpy explicitly in that subprocess before starting Ray is a useful early check that the environment is correct, but note that Ray workers are launched as separate Python processes, so they most likely do not inherit already-imported modules and must be able to import numpy themselves. Alternatively, consider using Ray's runtime_env feature to specify dependencies, though this may have limitations on Windows. Running Ray in subprocesses or from pyinstaller executables is not a well-supported or tested pattern, and its behavior may be undefined (see Ray team response).
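The suggestions above can be sketched as follows. This is an illustration under assumptions rather than a tested fix for your setup: `launch_with_same_env`, `child_can_import`, and `init_ray_with_numpy` are hypothetical helper names, and the `runtime_env` variant assumes `ray` is installed and that pip-based runtime environments work on your platform.

```python
import os
import subprocess
import sys


def launch_with_same_env(python_args):
    """Run a child Python process with the parent's interpreter and environment."""
    # sys.executable is the interpreter running this process. In a frozen
    # (pyinstaller) build it points at the bundled executable instead.
    cmd = [sys.executable, *python_args]
    return subprocess.run(
        cmd, env=os.environ.copy(), capture_output=True, text=True, check=False
    )


def child_can_import(module_name):
    """Return True if a child process started this way can locate module_name."""
    code = (
        "import importlib.util, sys; "
        "sys.exit(0 if importlib.util.find_spec(sys.argv[1]) else 1)"
    )
    return launch_with_same_env(["-c", code, module_name]).returncode == 0


def init_ray_with_numpy():
    """Alternative: ask Ray to provide numpy to workers through runtime_env."""
    import ray  # imported lazily so the helpers above work without Ray

    # Assumption: a pip-based runtime_env is usable in this deployment.
    return ray.init(runtime_env={"pip": ["numpy"]})
```

Calling `child_can_import("numpy")` before starting Ray gives a quick signal as to whether a subprocess launched this way would see numpy at all.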
Would you like more detail or a step-by-step breakdown?