Tensor parallelism with torchrun inside Ray

nivi_billa · April 29, 2024, 7:25pm

Hi,

I'm using the gpt-fast library, and I'm running multi-GPU inference on a single node like this:

torchrun --standalone --nproc_per_node=8 generate.py --checkpoint_path llama-3-70b-instruct-hf-pt/model.pth

How can I run this across multiple nodes inside Ray?
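
To make the question concrete: outside Ray, a multi-node torchrun launch runs one launcher per node, each given the same rendezvous address and its own node rank. Below is a minimal sketch of the per-node commands that would be needed, assuming static rendezvous through `--master_addr`/`--master_port`. The helper name `build_torchrun_cmd`, the address `10.0.0.1`, the port, and the two-node layout are hypothetical placeholders, not anything from gpt-fast or Ray. The sketch only builds the commands; how best to start one such launcher per node from Ray is exactly the open part.

```python
import shlex


def build_torchrun_cmd(node_rank, nnodes, master_addr, master_port=29500,
                       nproc_per_node=8, script="generate.py", script_args=()):
    """Build the torchrun argv for one node of a multi-node launch.

    Hypothetical helper: every node gets the same master address/port
    and a distinct node_rank in [0, nnodes).
    """
    if not 0 <= node_rank < nnodes:
        raise ValueError("node_rank must be in [0, nnodes)")
    return [
        "torchrun",
        f"--nnodes={nnodes}",
        f"--node_rank={node_rank}",
        f"--nproc_per_node={nproc_per_node}",
        f"--master_addr={master_addr}",
        f"--master_port={master_port}",
        script,
        *script_args,
    ]


if __name__ == "__main__":
    # Assumed values for illustration: 2 nodes, rank-0 node reachable at 10.0.0.1.
    args = ["--checkpoint_path", "llama-3-70b-instruct-hf-pt/model.pth"]
    for rank in range(2):
        cmd = build_torchrun_cmd(rank, nnodes=2, master_addr="10.0.0.1",
                                 script_args=args)
        # Print the command each node would have to run.
        print(shlex.join(cmd))
```

Each node would have to run its own command (for example from a Ray task pinned to that node and invoking it as a subprocess, which is an assumption and not something I have verified works with gpt-fast).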
