Hi,
I am using:
- kuberay-operator-1.2.2
- ray-project/ray 2.9.0 (used by the operator setup above)
I have seen this compatibility matrix: upgrade-guide.html#kuberay-ray-compatibility
I want to train on a custom environment written in Python. My pip requirements specify:
ray[rllib]==2.9.0
ray[default]==2.9.0
However, I am encountering compatibility issues when I submit jobs:
[2025-02-26 01:41:08,060 E 134 134] (raylet) worker_pool.cc:565: Some workers of the worker process(159878) have not registered within the timeout. The process is dead, probably it crashed during start.
Traceback (most recent call last):
  File "/home/ray/anaconda3/lib/python3.9/site-packages/ray/_private/workers/default_worker.py", line 210, in <module>
    ray_params = RayParams(
TypeError: __init__() got an unexpected keyword argument 'node_id'
I found this commit, which adds node_id as a mandatory parameter:
The commit seems to be in tags ray-2.11.0 to ray-2.42.1.
This confuses me because Ray 2.9.0 predates those tags, so this change should not be in my setup.
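To narrow this down, here is a minimal sketch of a check I could run inside both the head pod and a worker pod. It reports which Ray version is actually installed and whether the installed RayParams accepts node_id. The helper names (accepts_keyword, describe_ray_install) are my own, and the import path ray._private.parameter is an assumption based on the Ray source tree. It is a private module, so it may differ between versions.

```python
import importlib
import inspect
from importlib import metadata


def accepts_keyword(func, name):
    """Return True if `func` can be called with keyword argument `name`."""
    params = inspect.signature(func).parameters
    keyword_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    if name in params and params[name].kind in keyword_kinds:
        return True
    # A **kwargs parameter accepts any keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


def describe_ray_install():
    """Collect details about the Ray installation visible to this interpreter."""
    info = {}
    try:
        # Version recorded in the pip metadata.
        info["dist_version"] = metadata.version("ray")
    except metadata.PackageNotFoundError:
        info["dist_version"] = None
        return info
    try:
        ray = importlib.import_module("ray")
        # Version and location of the code that is actually imported.
        info["module_version"] = getattr(ray, "__version__", None)
        info["module_path"] = getattr(ray, "__file__", None)
        # Assumption: RayParams lives in this private module.
        parameter = importlib.import_module("ray._private.parameter")
        info["rayparams_accepts_node_id"] = accepts_keyword(
            parameter.RayParams.__init__, "node_id"
        )
    except Exception as exc:  # a broken or mixed install may fail to import
        info["import_error"] = repr(exc)
    return info


if __name__ == "__main__":
    for key, value in describe_ray_install().items():
        print(f"{key}: {value}")
```

If dist_version and module_version differ, or the values differ between the head pod and the worker pod, that would point to files from two Ray versions being mixed in one environment.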
Can you help me understand how to analyze and fix this?