I have a cluster with mixed CPU types and a simple compute job that I submit to it.
There are 64 CPUs in total. When the job starts, it is split up and queued in a pool. But when a machine with a certain CPU type finishes early, the pending work is not reassigned to the idling CPUs; it just waits for the other resources to finish. Nothing is rebalanced automatically: the job is split once up front and posted.
async_results = [pool.apply_async(launch_compute, args=(data,)) for data in data_array]
results = [ar.get() for ar in async_results]
Is there any configuration that makes the cluster more efficient, so that it dynamically finds CPUs that have already finished their work and feeds the pending jobs to them?
Running on a bare machine is much faster if I distribute the work manually over simple sockets.