We are running some ray.tune unit tests on Linux. The test setup looks like this (we only launch one Ray cluster):
ray.init()
tune.run(xxxx)
ray.shutdown()
We occasionally see this error:
E ray.exceptions.RuntimeEnvSetupError: Failed to setup runtime environment.
E Could not create the actor because its associated runtime env failed to be created.
E Failed to create runtime environment {"envVars": {"TUNE_ORIG_WORKING_DIR": "xxxx"}} because the Ray agent couldn't be started due to the port conflict. See `dashboard_agent.log` for more details. To solve the problem, start Ray with a hard-coded agent port. `ray start --dashboard-agent-grpc-port [port]` and make sure the port is not used by other processes.
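For reference, here is a minimal sketch of the workaround the error message suggests: ask the OS for an unused port, then pass it to `ray start` via `--dashboard-agent-grpc-port`. The helper names `find_free_port` and `build_ray_start_command` are hypothetical, and adding `--head` is an assumption about a single-node test setup. Only the port flag itself comes from the error text.

```python
import socket
from typing import List


def find_free_port() -> int:
    """Ask the OS for a currently unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]


def build_ray_start_command(agent_port: int) -> List[str]:
    """Build the CLI invocation as an argument list (hypothetical helper).

    --dashboard-agent-grpc-port is the flag named in the error message;
    --head is an assumption for a single-node test cluster.
    """
    return [
        "ray",
        "start",
        "--head",
        f"--dashboard-agent-grpc-port={agent_port}",
    ]


if __name__ == "__main__":
    port = find_free_port()
    print(" ".join(build_ray_start_command(port)))
```

The resulting list could be handed to `subprocess.run(...)` in a test fixture, with the test then connecting via `ray.init(address="auto")`. Note that the port is only known to be free at the moment it is probed; another process can still grab it before Ray starts, so this narrows the race rather than removing it.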
cc @architkulkarni I think we should retry if the port number that we randomly generated turns out to be in use. Is that work planned for runtime env creation, and is my diagnosis correct?
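To make the retry idea concrete, here is a rough sketch of what "pick a random port, retry if it is in use" could look like. This is not Ray's actual implementation; the function names, the port range, and the attempt limit are all assumptions chosen for illustration.

```python
import random
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if binding to (host, port) fails, i.e. the port is taken."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((host, port))
        except OSError:
            return True
        return False


def pick_port_with_retry(low: int = 20000, high: int = 60000,
                         max_attempts: int = 10) -> int:
    """Draw random ports in [low, high] until one is free (illustrative only)."""
    for _ in range(max_attempts):
        port = random.randint(low, high)
        if not port_in_use(port):
            return port
    raise RuntimeError(
        f"No free port found in [{low}, {high}] after {max_attempts} attempts"
    )
```

A check like this is still subject to a race between probing and actually starting the agent, so a robust version would probably retry the agent startup itself rather than only the port selection.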
We deprecated that feature in 2.2 (it had been unmaintained for a while). We recommend that you run ray.init() and then run your workloads. If you think local_mode=True is important for unit tests, please file a feature request to re-enable it!
I’m running into the same issue ("Failed to create runtime environment for job 01000000 because the Ray agent couldn't be started due to the port conflict. See dashboard_agent.log for more details."). Is there a planned way to do this from ray.init, or do I still need to call out to a separate subprocess even for unit testing? Is there now a better way to do this? Furthermore, I have set include_dashboard=False in my ray.init call but am still getting this error. Is this expected?