How severely does this issue affect your experience of using Ray?
- High: It blocks me from completing my task.
How can I freely divide data among workers?
I have two models. Distributed training requires transferring data to the workers. How can I ensure that the workers of the two models receive consistent data?
Are you referring to data parallel training?
If you are using torch, take a look at ray/torch_trainer.py (master branch of ray-project/ray on GitHub). We have similar examples for other frameworks.
Can I assign a specific data block to a specific worker?
Are you planning to use Ray Dataset?
There is indeed logic in Ray Dataset that decides which dataset block goes to which worker for distributed training. If the blocks cannot be divided evenly among the workers, that logic splits some blocks into smaller ones so that each worker ends up with an even share.
However, end users usually do not need to know about this logic, so we abstract it away from them. From their perspective, they have a Ray Dataset and every worker gets an evenly sized shard of the data.
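To make the idea concrete, here is a minimal, framework-free sketch of that kind of assignment. It is not Ray Dataset's actual implementation: the function name `assign_blocks`, the row-range representation, and the choice to drop leftover rows are all assumptions made for illustration.

```python
from typing import List, Tuple

# (block_index, start_row, end_row), with end_row exclusive.
RowRange = Tuple[int, int, int]


def assign_blocks(block_sizes: List[int], num_workers: int) -> List[List[RowRange]]:
    """Give every worker the same number of rows, cutting blocks where needed.

    Hypothetical helper for illustration only. Rows left over after an even
    division are dropped (an assumption of this sketch).
    """
    if num_workers <= 0:
        raise ValueError("num_workers must be positive")

    per_worker = sum(block_sizes) // num_workers
    shards: List[List[RowRange]] = [[] for _ in range(num_workers)]
    if per_worker == 0:
        return shards  # fewer rows than workers: nothing to hand out evenly

    worker, needed = 0, per_worker
    for block_index, size in enumerate(block_sizes):
        start = 0
        while start < size and worker < num_workers:
            # Take as much of this block as the current worker still needs.
            take = min(needed, size - start)
            shards[worker].append((block_index, start, start + take))
            start += take
            needed -= take
            if needed == 0:
                # Current worker is full; move on to the next one.
                worker += 1
                needed = per_worker
    return shards


# Three blocks with 4, 4, and 2 rows, shared among 3 workers.
shards = assign_blocks([4, 4, 2], num_workers=3)
```

In this example each worker ends up with three rows, blocks 0 and 1 are each cut once, and the last row of block 2 is left unassigned.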
Do you mind telling us a bit more about your use case and why you would care about such details?
I want to use Ray Train for vertical federated learning, where the provider and promoter workers need to receive consistent data.
I have another question: can I use TorchTrainer for a model that implements gradient updates manually?
What do you mean by “consistent data”? My understanding is that the participants in VFL don’t need to share the sample space; rather, the feature space is split.
Could you describe the data consistency concern more concretely? For example, how do you plan to map the provider and promoter workers to Ray actors, and what data does each of them require? A schematic would be helpful.
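For instance, if “consistent” means that both parties must see the same sample IDs in the same order in every batch, the requirement could be sketched roughly as below. This is only one possible reading, written in plain Python without any Ray API; `aligned_batches`, `party_batches`, the shared `seed`, and the toy feature dictionaries are all hypothetical.

```python
import random
from typing import Dict, Iterable, Iterator, List, Tuple


def aligned_batches(
    sample_ids: Iterable[str], batch_size: int, seed: int, epoch: int
) -> Iterator[List[str]]:
    """Yield ID batches that are identical on every party.

    Assumes each party calls this with the same IDs, seed, and epoch.
    """
    order = sorted(sample_ids)  # canonical order, independent of local storage
    random.Random(f"{seed}:{epoch}").shuffle(order)  # same shuffle everywhere
    for i in range(0, len(order), batch_size):
        yield order[i : i + batch_size]


def party_batches(
    features: Dict[str, List[float]],
    shared_ids: Iterable[str],
    batch_size: int,
    seed: int,
    epoch: int,
) -> Iterator[Tuple[List[str], List[List[float]]]]:
    """Pair each aligned ID batch with this party's own feature columns."""
    for ids in aligned_batches(shared_ids, batch_size, seed, epoch):
        yield ids, [features[i] for i in ids]


# Toy data: each party holds different features for overlapping sample IDs.
provider = {"u1": [0.1], "u2": [0.2], "u3": [0.3], "u4": [0.4]}
promoter = {"u2": [1.0, 2.0], "u3": [3.0, 4.0], "u4": [5.0, 6.0], "u5": [7.0, 8.0]}
shared = set(provider) & set(promoter)  # only IDs known to both parties

provider_ids = [ids for ids, _ in party_batches(provider, shared, 2, seed=7, epoch=0)]
promoter_ids = [ids for ids, _ in party_batches(promoter, shared, 2, seed=7, epoch=0)]
```

Both parties then iterate over the same ID batches while keeping their feature columns local. How the shared ID set is agreed on in the first place is outside the scope of this sketch.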
As for gradient updates, Ray Train doesn’t do any of the gradient or weight syncing; that is all handled by torch. Again, I would be curious to learn why you want to do it manually.
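For context on what “gradient syncing” means in data-parallel training, here is a framework-free sketch. It is not how torch or Ray Train implements it: the one-parameter model, the helper names, and the in-process “workers” are assumptions chosen to keep the example small.

```python
from typing import List, Tuple

Shard = List[Tuple[float, float]]  # (x, y) pairs held by one worker


def local_gradient(w: float, shard: Shard) -> float:
    """Gradient of the mean squared error for y_hat = w * x on one shard."""
    return sum(2.0 * (w * x - y) * x for x, y in shard) / len(shard)


def all_reduce_mean(grads: List[float]) -> float:
    """Stand-in for the collective step: average the workers' gradients."""
    return sum(grads) / len(grads)


def train(shards: List[Shard], lr: float = 0.05, steps: int = 200) -> float:
    w = 0.0  # every worker starts from the same weight
    for _ in range(steps):
        # Each worker computes a gradient on its own shard only.
        grads = [local_gradient(w, shard) for shard in shards]
        # After averaging, all workers apply the identical manual update,
        # so their copies of w never diverge.
        w -= lr * all_reduce_mean(grads)
    return w


# Two workers with equally sized shards drawn from y = 3 * x.
shards = [[(1.0, 3.0), (2.0, 6.0)], [(3.0, 9.0), (4.0, 12.0)]]
w_final = train(shards)
```

Because the shards are the same size, the averaged gradient equals the full-batch gradient, and `w_final` converges towards 3. A manual update rule would replace the `w -= ...` line while the averaging step stays the same.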