I need to schedule specific actors/tasks on specific machines. By using a placement group with node_id as a resource, I was able to achieve this. However, when I run pytest across multiple test files, it hangs, maybe due to resource contention?
This is how I create the placement group, maybe in an unusual or even wrong way:
try:
    pg = []
    res_cap = {node_id: 1}  # use node_id as a resource with a capacity of 1
    cpu_cap = {"CPU": 1}
    pg.append(res_cap)
    pg.append(cpu_cap)
    return placement_group(
        pg, strategy=PlacementStrategy.STRICT_PACK
    )
except Exception as e:
    print(e)
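For anyone trying to reproduce this, here is a minimal, self-contained sketch of the same idea. It is not my exact code: the names `build_bundles` and `create_pg_or_fail` are made up for illustration, it assumes `node_resource` is the name of a resource that exists on the target node with capacity 1, and it passes the strategy as the string "STRICT_PACK", which is what the Python `placement_group` function accepts as far as I know. It also waits with a timeout so that an unschedulable placement group raises an error instead of hanging.

```python
def build_bundles(node_resource, num_cpus=1):
    """Return two bundles: one claiming the node resource, one reserving CPU."""
    return [{node_resource: 1}, {"CPU": num_cpus}]


def create_pg_or_fail(node_resource, timeout_s=30):
    """Create a STRICT_PACK placement group, or raise if it is not ready in time."""
    # Imported lazily so build_bundles can be used without Ray installed.
    from ray.util.placement_group import placement_group, remove_placement_group

    pg = placement_group(build_bundles(node_resource), strategy="STRICT_PACK")
    # wait() returns False if the bundles could not be reserved within the timeout.
    if not pg.wait(timeout_seconds=timeout_s):
        # Clean up the pending group so it does not linger in the cluster.
        remove_placement_group(pg)
        raise TimeoutError(f"placement group not ready after {timeout_s}s")
    return pg
```

With this, a test that cannot get its resources should fail after `timeout_s` seconds rather than block forever.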
Then I used pytest to run 2 test files, both of which create a placement group the same way as above, and the second test function hangs. The output of ray status shows a list of pending tasks/actors, and I don't know what they mean:
Resources
Usage:
 0.0/32.0 CPU
 0.00/69.754 GiB memory
 0.00/33.886 GiB object_store_memory
Demands:
 {'CPU_group_ed4e518ee308c183c1704fa219f35dfe': 1.0}: 1+ pending tasks/actors
 {'CPU_group_f1aaa2d6530300eb22f3b91958bfc10e': 1.0}: 1+ pending tasks/actors
 {'CPU_group_8eb1936673b9e77c491b9145b39d2653': 1.0}: 1+ pending tasks/actors
 {'CPU_group_70faa0a852b14c608765e750e07fe568': 1.0}: 1+ pending tasks/actors
 {'CPU_group_cac9fe0ad2a29d55c25d7d01ea0f0a5f': 1.0}: 1+ pending tasks/actors
 {'CPU_group_eb09d2b20272f2aaa2e6265da0753f6d': 1.0}: 1+ pending tasks/actors
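One thing I am considering, assuming the hang really is caused by an earlier test's placement group still existing when the next test starts (I have not confirmed this), is to remove the placement group explicitly when each test finishes. A rough sketch follows; `scoped_placement_group` is a hypothetical helper name, and the strategy is again passed as a string.

```python
from contextlib import contextmanager


@contextmanager
def scoped_placement_group(bundles, strategy="STRICT_PACK"):
    """Create a placement group and always remove it when the block exits."""
    # Imported lazily; requires Ray at call time.
    from ray.util.placement_group import placement_group, remove_placement_group

    pg = placement_group(bundles, strategy=strategy)
    try:
        yield pg
    finally:
        # Release the reserved bundles even if the test body raised.
        remove_placement_group(pg)
```

In a test this would be used as `with scoped_placement_group([{node_id: 1}, {"CPU": 1}]) as pg: ...`, or wrapped in a pytest fixture that yields `pg`.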
Can I get some advice on how to solve this issue, and is my way of creating the placement group correct or not? Thanks very much.