How severe does this issue affect your experience of using Ray?
- Medium: It contributes to significant difficulty to complete my task, but I can work around it.
My question is: how do I define num_gpus in ray.remote() without explicitly adding @ray.remote above the target class?
I'm trying to use Ray to accelerate my federated learning training task. Here is part of the implementation:
from typing import Dict, List

import ray
from ray.actor import ActorHandle


class FLbenchTrainer:
    def __init__(
        self, server, client_cls, mode: str, num_workers: int, init_args: Dict
    ):
        self.server = server
        self.mode = mode
        self.num_workers = num_workers
        if self.mode == "serial":
            # Plain Python object, no Ray involved.
            self.worker = client_cls(**init_args)
        elif self.mode == "parallel":
            # Wrap the undecorated class as a Ray actor class at runtime.
            ray_client = ray.remote(client_cls)
            self.workers: List[ActorHandle] = [
                ray_client.remote(**init_args) for _ in range(self.num_workers)
            ]
I don't want to add the @ray.remote decorator explicitly above every client_cls in my project, because sometimes I need my code to run serially. So I call ray.remote(client_cls) to get the Ray actor class.
Since num_gpus is not set, when num_workers > 1 (num_workers equals the number of Ray actors), no actor can use the GPU properly.
I know I need to set num_gpus = NUM_MY_GPU / num_workers so that each actor gets its own share of the GPU resources. But as I said, num_workers is not fixed, and sometimes I don't want Ray activated at all, so I don't want to add @ray.remote explicitly. What should I do?
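To make the question concrete, here is a minimal sketch of what I am after, not a confirmed solution. It relies on ray.remote accepting keyword options and returning a decorator that can be applied to an undecorated class at runtime. The helper names gpus_per_worker and make_workers are hypothetical, and total_gpus stands in for NUM_MY_GPU.

```python
from typing import Any, Dict, List


def gpus_per_worker(total_gpus: int, num_workers: int) -> float:
    """Split the available GPUs evenly across the workers (hypothetical helper)."""
    if num_workers <= 0:
        raise ValueError("num_workers must be positive")
    if total_gpus >= num_workers:
        # Ray expects whole numbers for requests above one GPU,
        # so round down to an integer share.
        return total_gpus // num_workers
    # Fewer GPUs than workers: each actor requests a fraction of one GPU.
    return total_gpus / num_workers


def make_workers(
    client_cls: type,
    init_args: Dict[str, Any],
    num_workers: int,
    total_gpus: int,
) -> List[Any]:
    """Create num_workers Ray actors from an undecorated class (hypothetical helper)."""
    import ray  # imported lazily so serial mode never needs Ray

    share = gpus_per_worker(total_gpus, num_workers)
    # ray.remote(num_gpus=...) returns a decorator; applying it here
    # avoids putting @ray.remote above the class definition.
    remote_cls = ray.remote(num_gpus=share)(client_cls)
    return [remote_cls.remote(**init_args) for _ in range(num_workers)]
```

With this, the parallel branch of FLbenchTrainer could call make_workers(client_cls, init_args, self.num_workers, NUM_MY_GPU), and the serial branch stays untouched. An alternative that should behave the same is ray.remote(client_cls).options(num_gpus=share), which sets the resource request on an already wrapped actor class.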