Issues creating cluster on AWS

I am experimenting with Ray. So far I have created a “cluster” on my local PC and run a Python script against it. I’m now trying to set up a cluster on AWS to see if I can do the same there, but I keep hitting issues, first on my Windows PC and then in CloudShell on AWS. I’m following the Medium article “A Step-by-Step Guide to Scaling Your First Python Application in the Cloud”, using the example YAML and Python code given there. When I run the ray up cluster.yaml command, the following error appears at the bottom of the output:

FileNotFoundError: [Errno 2] No such file or directory: 'rsync': 'rsync'

On AWS, an m5.large EC2 instance is created and running. When I then try to run the sample Python file, I get this output:

ray submit cluster.yaml step_1.py

2022-01-18 13:51:22,271 INFO util.py:282 -- setting max workers for head node type to 0
2022-01-18 13:51:22,271 INFO util.py:286 -- setting max workers for ray.worker.default to 1
Loaded cached provider configuration
If you experience issues with the cloud provider, try re-running the command with --no-config-cache.
Head node (i-046c026f92805d424) is in state update-failed.
Traceback (most recent call last):
  File "/home/cloudshell-user/.local/bin/ray", line 8, in <module>
    sys.exit(main())
  File "/home/cloudshell-user/.local/lib/python3.7/site-packages/ray/scripts/scripts.py", line 1989, in main
    return cli()
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/home/cloudshell-user/.local/lib/python3.7/site-packages/ray/scripts/scripts.py", line 1274, in submit
    down=False)
  File "/home/cloudshell-user/.local/lib/python3.7/site-packages/ray/autoscaler/_private/commands.py", line 1127, in rsync
    config, config_file, override_cluster_name, create_if_needed=False)
  File "/home/cloudshell-user/.local/lib/python3.7/site-packages/ray/autoscaler/_private/commands.py", line 1258, in _get_running_head_node
    config["cluster_name"]))
RuntimeError: Head node of cluster (basic-ray) not found!

Any help would be greatly appreciated

Can you install rsync on the machine where you’re running ray up?
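If it helps, here is a minimal sketch for checking whether rsync is visible on your PATH before you launch the cluster. The FileNotFoundError above suggests the launcher calls rsync as an external program, so a missing binary would fail in just this way. The helper name find_missing_tools is hypothetical, and including ssh in the default list is my assumption rather than something shown in your output.

```python
import shutil


def find_missing_tools(tools=("rsync", "ssh")):
    """Return the names in `tools` that cannot be found on PATH.

    The default tool list is an assumption for illustration; only rsync
    is named in the error message from this thread.
    """
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All required tools were found on PATH")
```

Run it in the same shell where you run ray up, since PATH can differ between shells.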

Hmmm …

[cloudshell-user@ip-10-1-14-60 ~]$ sudo yum install fsync
Loaded plugins: ovl, priorities
amzn2-core | 3.7 kB 00:00:00
No package fsync available.
Error: Nothing to do

Sorry, my mistake, it’s rsync of course, not fsync. I now have it installed on CloudShell and will try to create the cluster once more.

An update, Richard.

I managed to get it working on CloudShell, which is great. Thanks for your suggestion on rsync. I don’t suppose you know of any way to do something similar on Windows, as that’s the system I’m most familiar with? AFAIK rsync is not a Windows command.