LLM model loading

Hi everyone,

I have a model that requires 16GB of RAM and 8 CPUs to run. In my cluster, I have two worker nodes, each with 8GB of RAM and 8 CPUs.

Is it possible to run this task, given that I have 16GB of RAM combined across the two nodes (8GB + 8GB)?

Any help or guidance would be greatly appreciated!

THANK YOU!!!