Hi,
What would be the best way to run a script in parallel on all (or a subset) of the current worker nodes?
Thanks
Assuming you meant a Python job (a Ray app), try a placement group with strategy="STRICT_SPREAD". For details, check out the placement group documentation.
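To make the suggestion concrete, here is a rough sketch of how it could look, not a definitive recipe. It assumes a Ray cluster is already running and reachable via `address="auto"`, and that each node has at least one free CPU. The names `make_bundles`, `run_on_distinct_nodes`, and `where_am_i` are made up for illustration. The idea is to reserve one bundle per node with STRICT_SPREAD and pin one task to each bundle.

```python
import socket


def make_bundles(num_nodes, cpus_per_bundle=1):
    """Build one resource bundle per node (hypothetical helper).

    With STRICT_SPREAD, each bundle must land on a different node.
    """
    return [{"CPU": cpus_per_bundle} for _ in range(num_nodes)]


def run_on_distinct_nodes(num_nodes):
    # Ray is imported here so the helper above works without Ray installed.
    import ray
    from ray.util.placement_group import placement_group
    from ray.util.scheduling_strategies import PlacementGroupSchedulingStrategy

    # Assumption: a cluster is already up and this process can reach it.
    ray.init(address="auto")

    pg = placement_group(make_bundles(num_nodes), strategy="STRICT_SPREAD")
    ray.get(pg.ready())  # blocks until every bundle has been reserved

    @ray.remote(num_cpus=1)
    def where_am_i():
        # Placeholder for the real work; returns the host it ran on.
        return socket.gethostname()

    # Pin one task to each bundle, hence one task per node.
    refs = [
        where_am_i.options(
            scheduling_strategy=PlacementGroupSchedulingStrategy(
                placement_group=pg,
                placement_group_bundle_index=i,
            )
        ).remote()
        for i in range(num_nodes)
    ]
    return ray.get(refs)
```

Calling `run_on_distinct_nodes(3)` on a cluster with at least three nodes should return three hostnames. If fewer nodes are available, the placement group cannot be satisfied and `pg.ready()` will keep waiting, so pick `num_nodes` accordingly.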
Thanks for answering! However, I meant literally running the same script on all nodes, i.e. mpirun style (or Slurm sbatch style).