Ray performance using physical CPUs versus logical CPUs

Hi there,

I am interested in how Ray performs with physical CPUs versus logical CPUs. The Debugging and Profiling page of the Ray v1.9.1 docs says, in particular:

Do the machines you’re running on have fewer physical cores than logical cores? You can check the number of logical cores with psutil.cpu_count() and the number of physical cores with psutil.cpu_count(logical=False). This is common on a lot of machines and especially on EC2. For many workloads (especially numerical workloads), you often cannot expect a greater speedup than the number of physical CPUs.
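For reference, the check described in that passage can be sketched roughly as below. The two psutil.cpu_count calls come from the quoted docs; the helper names cpu_counts and speedup_ceiling are hypothetical, the fallback to os.cpu_count() is an assumption for machines without psutil installed, and min(workers, physical cores) is only a rough model of the docs' claim, not a measured result.

```python
import os

try:
    import psutil  # third-party package named in the quoted docs
except ImportError:
    psutil = None  # assumption: fall back to the standard library only


def cpu_counts():
    """Return (logical, physical) core counts; either may be None if unknown."""
    logical = os.cpu_count()
    physical = None
    if psutil is not None:
        logical = psutil.cpu_count() or logical
        physical = psutil.cpu_count(logical=False)
    return logical, physical


def speedup_ceiling(num_workers, physical_cores):
    """Rough upper bound on speedup for compute-bound work (illustrative model)."""
    return min(num_workers, physical_cores)


logical, physical = cpu_counts()
print(f"logical cores: {logical}, physical cores: {physical}")
if logical and physical:
    print(f"rough speedup ceiling with {logical} workers: "
          f"{speedup_ceiling(logical, physical)}x")
```

On a machine with Hyper-Threading enabled, the logical count is typically twice the physical count, and under this model the ceiling is set by the physical count.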

I have the following question. Why can’t I expect a greater speedup than the number of physical CPUs for many workloads (especially numerical workloads)?

Thanks in advance!

Hi @YarShev!

Why can’t I expect a greater speedup than the number of physical CPUs for many workloads (especially numerical workloads)?

Because numerical workloads are often compute-bound: they spend most of their time executing on the physical core rather than waiting on memory or I/O.

A logical core (i.e. a hardware thread under Hyper-Threading) shares a physical core's execution units with its sibling logical core. When the thread running on one logical core stalls, e.g. while waiting on a memory access, the thread on the other logical core can use the otherwise idle execution units. If all threads involve mostly on-physical-core execution and therefore stall very rarely, they end up in constant contention for the physical core, yielding very little parallelism between the threads, and possibly even degrading performance because they also compete for shared resources such as caches.
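One way to see this on your own machine is a small timing experiment. The sketch below uses the standard-library multiprocessing.Pool in place of Ray so that it has no dependencies; the function names (burn, time_with_workers, compare) and the task sizes are hypothetical choices, and a pure-Python loop is only an approximation of a compute-bound numerical kernel.

```python
import time
from multiprocessing import Pool


def burn(n):
    """Compute-bound task: sum of squares below n, with no I/O."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def time_with_workers(num_workers, num_tasks=16, n=2_000_000):
    """Wall-clock seconds to run num_tasks calls of burn(n) on num_workers processes.

    The measurement includes pool start-up, so very small n will be dominated
    by process creation rather than by the computation.
    """
    start = time.perf_counter()
    with Pool(processes=num_workers) as pool:
        pool.map(burn, [n] * num_tasks)
    return time.perf_counter() - start


def compare(worker_counts):
    """Return {workers: speedup relative to a single worker}."""
    baseline = time_with_workers(1)
    return {w: baseline / time_with_workers(w) for w in worker_counts}
```

On a machine with, say, 4 physical and 8 logical cores, you could call compare([4, 8]) from under an if __name__ == "__main__": guard. If the explanation above applies to the workload, the speedup at 8 workers should be close to the speedup at 4; the exact numbers will vary with hardware and system load.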

Hi @Clark_Zinzow, thank you for the answer! That makes sense to me.