Ray Train can't run on Kaggle

Hello @matthewdeng, I have the same issue when trying to use Ray on Kaggle. To answer your question: yes, it fails at the ray.init() line. Even if you pass include_dashboard=False to ray.init(), it still tries to start the dashboard and throws the same error mentioned by @man_Iron. The error also crashes the notebook, so all variables are lost and you have to re-run every cell you had executed before. Finally, note that this error is only thrown when a GPU accelerator (P-100 or T4 x2) is added to the session. In a pure CPU session, ray.init() does not throw any error.

To reproduce, log in to Kaggle, add a GPU accelerator to your session, and run the following two lines:

import ray
ray.init()

Adding the include_dashboard=False option makes no difference: it still throws the error mentioned by @man_Iron.
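For anyone who wants to try both variants from a single cell, here is a minimal sketch of the reproduction. The helper names `build_init_kwargs` and `try_ray_init` are my own, not part of Ray; the only Ray calls used are `ray.init()`, its `include_dashboard` option, and `ray.shutdown()`. Since the failure crashes the notebook in my case, I would not expect the `except` branch to catch it in a GPU session; it is only there in case the error surfaces as a normal Python exception.

```python
import importlib.util


def build_init_kwargs(disable_dashboard=True):
    """Return the keyword arguments to pass to ray.init() (hypothetical helper)."""
    kwargs = {}
    if disable_dashboard:
        # Same option as in the post: ask Ray not to start the dashboard.
        kwargs["include_dashboard"] = False
    return kwargs


def try_ray_init(**kwargs):
    """Call ray.init(**kwargs) and return a short status string (hypothetical helper)."""
    if importlib.util.find_spec("ray") is None:
        return "ray not installed"
    import ray

    try:
        ray.init(**kwargs)
    except Exception as exc:
        # Only reached if the failure is raised as a regular Python exception;
        # a hard notebook crash will never get here.
        return f"ray.init failed: {exc!r}"
    ray.shutdown()
    return "ok"
```

Calling `print(try_ray_init())` and then `print(try_ray_init(**build_init_kwargs()))` in a GPU session should exercise both the default case and the include_dashboard=False case.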

When I run those two lines I get the same error as @man_Iron, with one exception: today, the first time I connected the P-100, I got the following, more informative error. However, I could not reproduce it a second time.

Note: Changing the environment to the latest one does not change anything.

Thank you in advance for your help :sweat_smile:
