Does the ray-llm model scaling config support CPU-only deployment,
or is a GPU required to run an Aviary model?