What is the difference between `LocalDependencyResolver::ResolveDependencies` (in core_worker/transport/dependency_resolver.h) and `DependencyManager::RequestTaskDependencies` (in raylet/dependency_manager.h)?
I know that during scheduling, the arguments a task depends on must be obtained before a worker is allocated. But what exactly does dependency resolution do in the core worker?
@wangxingyu the responsibility of dependency resolution is split between core worker and distributed scheduler. This enables optimizations such as small-object inlining and direct worker->worker task submission, which would not be possible if all resolution was done in the raylet.
For example, consider a task `f.remote(g.remote())`, where both f and g return small values and are submitted from the same process. The core worker owns both, so it can schedule g, receive g's return value inline, and then schedule f in the same process without any additional RPCs or IPCs to the raylet. For large arguments, the core worker only waits for the argument to be created; it is up to the distributed scheduler to fetch the large object dependency.
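The split described above can be sketched as a toy model of core-worker dependency resolution (all names and the size threshold here are illustrative assumptions, not Ray's actual API): small objects are inlined into the task spec, while large objects stay as references for the distributed scheduler to fetch.

```python
# Toy model of core-worker dependency resolution. INLINE_THRESHOLD and
# ObjectRef are hypothetical stand-ins, not Ray's real data structures.
INLINE_THRESHOLD = 100 * 1024  # assumed ~100 KiB cutoff for inlining

class ObjectRef:
    def __init__(self, value, size):
        self.value = value
        self.size = size

def resolve_dependencies(args):
    resolved = []
    for arg in args:
        if isinstance(arg, ObjectRef) and arg.size <= INLINE_THRESHOLD:
            resolved.append(arg.value)   # small value: inline it directly
        else:
            resolved.append(arg)         # large object: leave the ref in place
    return resolved

small = ObjectRef(42, size=8)
large = ObjectRef(b"...", size=10 * 1024 * 1024)
assert resolve_dependencies([small, 7]) == [42, 7]  # small value inlined
assert resolve_dependencies([large])[0] is large    # large ref untouched
```

With this split, a chain of small-object tasks owned by one process never needs the plasma store or the raylet on the argument path at all.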
@ericl Thank you very much; your explanation makes sense. I have three more questions and look forward to your answers:
1. In your example, the owner schedules f to the same process as g. I understand this is Ray's leasing policy. My question is: if g requires num_cpus=1 and f requires num_cpus=2, will the core worker request a new worker from the raylet to execute f (i.e. the normal scheduling process), because the worker executing g has insufficient resources? I know that the leasing mechanism fails when the resource requirements cannot be satisfied or a timeout occurs.
2. For large objects, the core worker needs to wait for them to be created. The advantage is that the core worker can apply a locality-aware policy. I have another idea: this policy could also be placed in the raylet. The advantage is that it might reduce the time spent waiting for large dependencies to be created; the disadvantage is that it requires an additional RPC (worker -> local raylet). Since I have not run an experiment, I am curious about the performance difference between the two.
3. I wonder how Ray handles the trade-off between the locality-aware policy and load balancing (i.e. the default scheduling policy in the raylet).
> In your example, the owner schedules f to the same process as g. I understand this is Ray's leasing policy. My question is: if g requires num_cpus=1 and f requires num_cpus=2, will the core worker request a new worker from the raylet to execute f (i.e. the normal scheduling process), because the worker executing g has insufficient resources? I know that the leasing mechanism fails when the resource requirements cannot be satisfied or a timeout occurs.
Yes, in this case a new worker will be requested. We use the resource requirement as the leasing key.
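The lease-key idea can be sketched as a toy cache keyed by resource shape (illustrative only, not Ray's actual data structures): a worker leased for one resource shape is never reused for a task with a different shape.

```python
# Toy sketch of lease reuse keyed by resource requirements. The names
# and structures are hypothetical, not Ray's real lease manager.
leases = {}        # resource shape -> leased worker id
next_worker = [0]  # counter for newly leased workers

def request_worker(resources):
    key = tuple(sorted(resources.items()))  # leasing key: the resource shape
    if key not in leases:                   # no matching lease: ask the raylet
        leases[key] = f"worker-{next_worker[0]}"
        next_worker[0] += 1
    return leases[key]

w_g = request_worker({"CPU": 1})  # lease obtained for g
w_f = request_worker({"CPU": 2})  # different shape -> f needs a new worker
assert w_g != w_f
assert request_worker({"CPU": 1}) == w_g  # same shape reuses the lease
```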
> For large objects, the core worker needs to wait for them to be created. The advantage is that the core worker can apply a locality-aware policy. I have another idea: this policy could also be placed in the raylet. The advantage is that it might reduce the time spent waiting for large dependencies to be created; the disadvantage is that it requires an additional RPC (worker -> local raylet). Since I have not run an experiment, I am curious about the performance difference between the two.
Hmm, I think the wait time would be comparable in either case: either way, some process subscribes to the owner of the object being waited on (the GetObjectStatus RPC). Whether the subscription comes from the core worker or the raylet should give comparable performance. Doing it from the core worker is faster, though, in the case where the owner is the same process that is doing the wait.
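The subscription pattern can be sketched like this (a toy model; real Ray uses gRPC and a pub/sub layer, and these class and method names are assumptions): the owner records subscribers and notifies them once the object is created, and it makes no difference whether the subscriber is a core worker or a raylet.

```python
# Toy sketch of a GetObjectStatus-style subscription to an object's
# owner. Illustrative names only, not Ray's actual implementation.
class Owner:
    def __init__(self):
        self.created = set()
        self.waiters = {}  # object id -> list of pending callbacks

    def subscribe(self, obj_id, callback):
        if obj_id in self.created:
            callback(obj_id)               # already created: reply immediately
        else:
            self.waiters.setdefault(obj_id, []).append(callback)

    def mark_created(self, obj_id):
        self.created.add(obj_id)
        for cb in self.waiters.pop(obj_id, []):
            cb(obj_id)                     # notify every pending subscriber

owner = Owner()
ready = []
owner.subscribe("obj1", ready.append)      # subscriber: core worker or raylet
owner.mark_created("obj1")
assert ready == ["obj1"]
```

In this model the wait time is dominated by when `mark_created` fires, which is the same no matter who subscribed; only the in-process case (owner waiting on itself) skips the RPC entirely.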
> I wonder how Ray handles the trade-off between the locality-aware policy and load balancing (i.e. the default scheduling policy in the raylet).
Good question. Currently I believe locality always takes precedence unless you use a policy like SPREAD. cc @Clark_Zinzow and @jjyao
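That precedence can be sketched as a toy node-selection function (illustrative only; this is not Ray's scheduler code, and the scoring is a deliberately simplified assumption): by default the node holding the most argument bytes wins, while a SPREAD-like policy ignores locality and distributes tasks across nodes.

```python
# Toy sketch of locality-first node selection vs a SPREAD-like policy.
# Hypothetical helper, not Ray's actual scheduling implementation.
from itertools import count

_rr = count()  # round-robin counter for the SPREAD-like policy

def pick_node(nodes, local_bytes, policy="DEFAULT"):
    # nodes: candidate node ids; local_bytes: node -> bytes of task args held
    if policy == "SPREAD":
        return nodes[next(_rr) % len(nodes)]   # ignore locality, spread out
    return max(nodes, key=lambda n: local_bytes.get(n, 0))  # locality wins

nodes = ["n1", "n2", "n3"]
sizes = {"n2": 10**9}            # n2 holds a large argument
assert pick_node(nodes, sizes) == "n2"  # default: locality takes precedence
picked = {pick_node(nodes, sizes, policy="SPREAD") for _ in range(3)}
assert picked == {"n1", "n2", "n3"}     # SPREAD ignores the locality signal
```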