As the screenshot above shows, when I try to run Ray Train on Kaggle I get a startup error: “ERROR services.py:1330 -- Failed to start the dashboard, return code -11.” The message also tells me to check ‘dashboard.log’ or ‘dashboard.err’ for further details. Am I missing a step in my setup, or is Ray incompatible with Kaggle? How can I disable the Ray dashboard to bypass this issue, and make sure my Ray application runs on Kaggle without this kind of startup error?
By the way, the log is:
(just 2 rows, and there is nothing in the err logs)
Is it failing at the ray.init() line?
Hello @matthewdeng, I have the same issue when trying to use Ray on Kaggle. To answer your question: yes, it is failing at the ray.init() line. Additionally, even if you pass include_dashboard=False to ray.init(), it still tries to start the dashboard and throws the same error mentioned by @man_Iron. Moreover, the error crashes the notebook, so you have to re-run every cell you had executed before (i.e., all variables are lost). Finally, note that the error is only thrown when a GPU accelerator (P-100 or T4 x2) is added to the session. In a pure CPU session, ray.init() does not throw any error.
To reproduce, you can just connect to a Kaggle account, add a GPU accelerator to your session and run the following 2 lines:
import ray
ray.init()
You can even add the include_dashboard=False option and it will still throw the error mentioned by @man_Iron.
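Since the failure takes the notebook down with it, one way to look at it without losing your variables might be to run the two lines in a child Python process. This is only a diagnostic sketch and not a fix: run_isolated and REPRO are names made up for this example, and it assumes the crash stays inside the child process, which has not been verified on Kaggle.

```python
import subprocess
import sys

# The two-line reproduction from this thread, with the dashboard option added.
REPRO = "import ray; ray.init(include_dashboard=False)"


def run_isolated(snippet, timeout=300):
    """Run a Python snippet in a child interpreter.

    Returns (returncode, stdout, stderr). On POSIX, a negative return code
    means the child was killed by a signal (for example -11 is SIGSEGV).
    Raises subprocess.TimeoutExpired if the child runs longer than `timeout`.
    """
    proc = subprocess.run(
        [sys.executable, "-c", snippet],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return proc.returncode, proc.stdout, proc.stderr


# Usage in a notebook cell (left commented out so that importing this
# block does not start Ray):
# code, out, err = run_isolated(REPRO)
# print(code)
# print(err[-2000:])  # tail of stderr, where Ray prints its startup errors
```

If the child exits with the same dashboard error, the stderr tail can be pasted here without restarting the session.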
When I run those 2 lines I get the same error as @man_Iron. The one exception was today, the first time I attached the P-100, when I got the following, more informative error. I could not reproduce it a second time.
Note: Changing the environment to the latest one does not change anything.
Thank you in advance for your help 
It looks like the issue is with the grpcio package. I solved it by running
!pip install grpcio==1.62.2
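After running the pip command, it may be worth confirming that the pinned version is the one actually installed before calling ray.init() again. A minimal sketch using only the standard library; grpcio_matches_pin and PINNED_GRPCIO are hypothetical names for this example, and 1.62.2 is simply the version reported to work above, not an officially documented requirement.

```python
from importlib.metadata import PackageNotFoundError, version

# Version reported to work in this thread (an assumption, not an official pin).
PINNED_GRPCIO = "1.62.2"


def grpcio_matches_pin(pinned=PINNED_GRPCIO):
    """Return True/False if grpcio is installed, or None if it is missing."""
    try:
        installed = version("grpcio")
    except PackageNotFoundError:
        return None
    return installed == pinned


status = grpcio_matches_pin()
if status is None:
    print("grpcio is not installed")
elif status:
    print("grpcio matches the pinned version")
else:
    print("grpcio is installed but at a different version:", version("grpcio"))
```

If the version still differs after the install, restarting the notebook session before importing ray is probably the safest next step, although that is a guess on my side.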
I tried installing grpcio, and while I'm no longer getting the same dashboard failure as @man_Iron, I have been stuck at this for a while now:
2024-05-15 15:32:19,943 INFO worker.py:1540 -- Connecting to existing Ray cluster at address: 172.19.2.2:6379...
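That log line usually means ray.init() found an already running cluster and is attaching to it instead of starting a new one. One possible cause, and this is an assumption rather than something confirmed here, is a leftover Ray instance from an earlier failed attempt, or a RAY_ADDRESS environment variable set in the session. A hedged sketch for clearing both before retrying; clear_ray_address and stop_local_ray are hypothetical helper names, and the second one just shells out to the `ray stop --force` command.

```python
import os
import shutil
import subprocess


def clear_ray_address(env):
    """Remove RAY_ADDRESS from the given mapping and return its old value.

    ray.init() without an explicit address honors this variable, so a stale
    value can make it attach to a cluster that is no longer healthy.
    """
    return env.pop("RAY_ADDRESS", None)


def stop_local_ray(timeout=60):
    """Run `ray stop --force` if the ray CLI is on PATH.

    Returns the command's exit code, or None when the CLI is not found.
    """
    ray_cli = shutil.which("ray")
    if ray_cli is None:
        return None
    proc = subprocess.run(
        [ray_cli, "stop", "--force"],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return proc.returncode


# Usage in a notebook cell, before calling ray.init() again
# (commented out so that nothing is stopped on import):
# clear_ray_address(os.environ)
# stop_local_ray()
```

If ray.init() still hangs after that, the problem is likely something other than a stale cluster.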