Hi, I have deployed three worker groups with the following differences in their resource configurations:

- `resources.limits.memory=2G`, `resources.requests.memory=2G`, and `maxReplicas=15`
- `resources.limits.memory=5G`, `resources.requests.memory=5G`, and `maxReplicas=10`
- `resources.limits.memory=10G`, `resources.requests.memory=10G`, and `maxReplicas=5`
In a dummy script, each task requests `num_cpus=1`. Ideally, when executing these tasks, I expect the autoscaler to maximize the number of tasks running in parallel by first spawning pods from the first worker group, then the second, and finally the third, provided the nodes have sufficient resources.
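For context, here is a minimal sketch of the kind of dummy workload I mean; the task name, sleep duration, and task count are just placeholders, and the only relevant detail is that each task requests `num_cpus=1`:

```python
import time
import ray

# Connect to the running Ray cluster (run from the head pod or a client).
ray.init(address="auto")

@ray.remote(num_cpus=1)
def dummy_task(i):
    # Placeholder work: each task only asks for 1 CPU and negligible memory.
    time.sleep(60)
    return i

# Submit many more tasks than fit on the current nodes so the autoscaler
# has to decide which worker groups to scale up.
results = ray.get([dummy_task.remote(i) for i in range(200)])
```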
However, the autoscaler behaves differently: it starts by spawning pods from the third worker group, then the second, and finally the first. Since those pods request more memory each, fewer of them fit on the nodes, which results in fewer pods overall.
Is there a way to configure the autoscaler to prioritize the worker group with the smallest resource requirements first, so that the number of pods (and therefore the number of tasks running in parallel) is maximized, as long as each task still has sufficient resources to execute?
Thank you in advance for your insights!