Hi doyen and welcome to the Ray community!
Since it connects fine once CloudWatch is removed, I’m guessing the issue is in the integration between CloudWatch <> Ray <> AWS?
- Does your Ray cluster have the proper IAM permissions to talk to CloudWatch? Mostly the create/write permissions that allow nodes to log to CloudWatch.
- Are there any network or firewall restrictions blocking access to CloudWatch?
- Does the security group associated with the Ray cluster allow SSH access, and is the key pair working?
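
For the IAM point, here is a rough sketch of the kind of policy I mean, built as a plain Python dict so you can compare it with what is attached to your node role. The action list is my assumption of a reasonable minimum for writing logs and metrics, not an official Ray requirement, and `build_cloudwatch_policy` is just a hypothetical helper name. Your setup may need more (for example, whatever the CloudWatch agent itself requires).

```python
import json

# Assumed minimal set of actions for writing logs and metrics to CloudWatch.
# This is an illustrative guess, not an official list from Ray or AWS.
CLOUDWATCH_ACTIONS = [
    "logs:CreateLogGroup",
    "logs:CreateLogStream",
    "logs:PutLogEvents",
    "logs:DescribeLogStreams",
    "cloudwatch:PutMetricData",
]


def build_cloudwatch_policy():
    """Return an IAM policy document (as a dict) allowing the actions above."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(CLOUDWATCH_ACTIONS),
                # "*" keeps the sketch simple; scope this down where you can.
                "Resource": "*",
            }
        ],
    }


# Serialize to JSON so it can be pasted into the IAM console or compared
# against the policy currently attached to the cluster's instance role.
policy_json = json.dumps(build_cloudwatch_policy(), indent=2)
print(policy_json)
```

If the role your head and worker nodes use is missing statements along these lines, that would be the first thing I’d look at.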
Are there any error messages, or is it just stuck on waiting-for-ssh?
Also, is there any code we can use to reproduce this issue? Can you paste your updated YAML config (make sure you censor any sensitive info, though)?
Here are a few other folks who have run into this issue (albeit not with CloudWatch specifically); maybe they can help you debug. Sorry I wasn’t more help!