Serve huggingface transformer on GPU with batching

Hi, sorry for the confusion here! You don't need to use @ray.remote with Serve deployments; the @serve.deployment decorator handles creating the actor for you.

Check here for how to specify GPUs: Core API: Deployments — Ray 1.12.0

from ray import serve

@serve.deployment(name="deployment1", ray_actor_options={"num_gpus": 0.5})
def func(*args):
    # Serve reserves half a GPU for each replica of this deployment.
    return do_something_with_my_gpu()
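
For completeness, here's a minimal sketch of deploying and querying it with the Ray 1.12 Serve API (serve.start / .deploy); the route comes from the deployment name "deployment1" in the snippet above:

import requests
from ray import serve

serve.start()
func.deploy()  # starts a replica with 0.5 GPU reserved

# Query over HTTP; the default route prefix is the deployment name.
print(requests.get("http://127.0.0.1:8000/deployment1").text)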

And does the following batching tutorial help with the batching question? Batching Tutorial — Ray 1.12.0
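
As a rough sketch of how batching composes with a GPU deployment for a HuggingFace pipeline (the model task, batch size, and timeout here are just placeholders, and the exact serve.batch arguments may differ slightly between versions):

from ray import serve
from transformers import pipeline

@serve.deployment(ray_actor_options={"num_gpus": 0.5})
class Transformer:
    def __init__(self):
        # Load the model onto the GPU once per replica.
        self.pipe = pipeline("sentiment-analysis", device=0)

    @serve.batch(max_batch_size=8, batch_wait_timeout_s=0.1)
    async def handle_batch(self, texts):
        # `texts` is the list of inputs Serve accumulated into one batch;
        # the pipeline returns one result per input, which Serve maps back
        # to the individual requests.
        return self.pipe(texts)

    async def __call__(self, request):
        return await self.handle_batch(await request.json())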
