Hi, sorry for the confusion here! You don’t need to specify ray.remote when using Serve deployments; our @serve.deployment wrapper handles that for you.
GPUs are requested through the deployment's ray_actor_options. Check here for the details: Core API: Deployments — Ray 1.12.0
from ray import serve

# Each replica reserves half a GPU, so two replicas can share one device.
@serve.deployment(name="deployment1", ray_actor_options={"num_gpus": 0.5})
def func(*args):
    return do_something_with_my_gpu()
For the batching question, does the following tutorial help? Batching Tutorial — Ray 1.12.0
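In case it helps to see the idea behind batching before reading the tutorial, here is a rough sketch in plain Python with asyncio. It is not how Serve implements batching, only an illustration of the pattern: individual calls are queued, then handed to one function as a list once the batch is full or a short timeout expires. The names MicroBatcher and double_all and the constructor parameters are made up for this example. In Serve itself you would use the @serve.batch decorator described in the tutorial (it takes max_batch_size and batch_wait_timeout_s) rather than writing this yourself.

```python
import asyncio
from typing import Callable, List


class MicroBatcher:
    """Hypothetical helper: queues single requests and runs them as one batch."""

    def __init__(self, handler: Callable[[list], list],
                 max_batch_size: int = 4, wait_timeout_s: float = 0.05):
        self._handler = handler            # takes a list of inputs, returns a list of outputs
        self._max_batch_size = max_batch_size
        self._wait_timeout_s = wait_timeout_s
        self._pending = []                 # (input, future) pairs waiting to be flushed
        self._flush_task = None            # timer that flushes a partially filled batch

    async def submit(self, item):
        fut = asyncio.get_running_loop().create_future()
        self._pending.append((item, fut))
        if len(self._pending) >= self._max_batch_size:
            self._flush()                  # batch is full: run it right away
        elif self._flush_task is None:
            # First item of a new batch: start the timeout timer.
            self._flush_task = asyncio.ensure_future(self._flush_after_timeout())
        return await fut

    async def _flush_after_timeout(self):
        await asyncio.sleep(self._wait_timeout_s)
        self._flush_task = None            # clear first so _flush does not cancel this task
        self._flush()

    def _flush(self):
        if self._flush_task is not None:
            self._flush_task.cancel()
            self._flush_task = None
        batch, self._pending = self._pending, []
        if not batch:
            return
        outputs = self._handler([item for item, _ in batch])
        # Hand each caller the output that matches its own input.
        for (_, fut), out in zip(batch, outputs):
            fut.set_result(out)


batch_sizes = []


def double_all(inputs: List[int]) -> List[int]:
    # Stand-in for a vectorized model call; records how many inputs arrived together.
    batch_sizes.append(len(inputs))
    return [2 * x for x in inputs]


async def main():
    batcher = MicroBatcher(double_all, max_batch_size=4, wait_timeout_s=0.05)
    # Six concurrent callers, each submitting a single input.
    return await asyncio.gather(*(batcher.submit(i) for i in range(6)))


results = asyncio.run(main())
print(results, batch_sizes)
```

Each caller still sends and receives a single item, while the handler only ever sees lists. That is the same contract the tutorial describes for a batched Serve method, which is why the batched function there has to accept and return a list.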