I’m running a Ray cluster on a Slurm cluster, and I’m looking for a way to find where jobs are actually running. ray status returns the total set of nodes and the number of CPUs that are currently active. I would like to remove the inactive nodes from my Ray cluster, but first I need to know which nodes are inactive, and I would like to be able to do this programmatically. Many thanks for any response.
If anyone is ever interested, the command was:
ray list tasks --address="headnode_ip"
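For the programmatic part of the question, the same information should be reachable from Python. The following is only a rough sketch, assuming a recent Ray 2.x release where `ray.util.state.list_tasks` is available and where each returned task record carries a `node_id` field. The helper names `find_idle_nodes` and `idle_nodes_from_cluster` are made up for illustration and are not part of Ray.

```python
from typing import Iterable, List


def find_idle_nodes(nodes: Iterable[dict], busy_node_ids: Iterable[str]) -> List[str]:
    """Return the IPs of alive nodes that host none of the given busy node IDs.

    `nodes` is expected to look like the output of ray.nodes(): a list of
    dicts with "NodeID", "Alive" and "NodeManagerAddress" keys.
    """
    busy = set(busy_node_ids)
    return sorted(
        node["NodeManagerAddress"]
        for node in nodes
        if node.get("Alive") and node["NodeID"] not in busy
    )


def idle_nodes_from_cluster(limit: int = 10_000) -> List[str]:
    """Hypothetical wrapper: query a live cluster for nodes with no running task."""
    # Imported lazily so the pure helper above can be used without Ray installed.
    import ray
    from ray.util.state import list_tasks  # assumption: Ray 2.x state API

    # Assumes this runs on a machine that can reach the head node.
    ray.init(address="auto", ignore_reinit_error=True)

    # Only tasks that are currently executing; `limit` caps the result size.
    running = list_tasks(filters=[("state", "=", "RUNNING")], limit=limit)
    return find_idle_nodes(ray.nodes(), (task.node_id for task in running))
```

Calling `idle_nodes_from_cluster()` from a script attached to the cluster would then give the addresses of nodes with no running task at that moment. Note that this only looks at tasks: a node that hosts a long-lived actor but no running task would still be reported as idle, so it may be worth checking actors as well before removing anything.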