Performance-impact of Hyper-Threading

performance intel-core-i7 intel-core-i5 x86 hyper-threading

22,137

Solution 1

It is likely not a measurement error. In fact, this is a long-running debate in game performance, since games are usually designed to depend heavily on single-core performance. According to this article from Intel, Hyper-Threading is:

Hyper-Threading Technology from Intel allows one physical processor package to be perceived as two separate logical processors within the operating system. Processor resources enabled for Hyper-Threading Technology duplicate, tag, or share the majority of resources. Sharing resources allows a more efficient use of the processor for a significant performance increase, at less than 5% die size and power consumption increase compared to a single processor package. However, Hyper-Threading Technology cannot have performance expectations equivalent to that of multiprocessing where all the processor resources are replicated.

In the table that you have shown, Cinebench tests a single core of the processor. In short, HT (Hyper-Threading) exposes two logical cores for each physical core (the one evaluated in the test). If the test launches a single process that cannot be split up, sharing the core's resources between two logical cores can degrade the result. With HT disabled there is nothing to balance, because Windows and Cinebench see only a single processor per core.
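
To make the "two logical cores per physical core" idea concrete, here is a minimal sketch of how one might list which logical CPUs share a physical core. It assumes a Linux system, where the kernel exposes this in `/sys/devices/system/cpu/cpuN/topology/thread_siblings_list`; the function names `parse_cpu_list` and `physical_core_groups` are made up for this illustration, and it is not how Windows or Cinebench do it.

```python
from pathlib import Path


def parse_cpu_list(text: str) -> list[int]:
    """Parse a Linux cpulist string such as "0,4" or "0-3,8" into CPU numbers."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return cpus


def physical_core_groups(sysfs_root: str = "/sys/devices/system/cpu") -> set[tuple[int, ...]]:
    """Return one tuple of logical CPU numbers per physical core.

    Returns an empty set when the sysfs files are not present
    (for example on a non-Linux system).
    """
    groups = set()
    pattern = "cpu[0-9]*/topology/thread_siblings_list"
    for path in Path(sysfs_root).glob(pattern):
        groups.add(tuple(parse_cpu_list(path.read_text())))
    return groups


if __name__ == "__main__":
    for group in sorted(physical_core_groups()):
        print("physical core shared by logical CPUs:", group)
```

With HT enabled each group would typically contain two logical CPUs; with HT disabled in the BIOS each group should contain one.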

If we add another test from Tom's Hardware to compare it with the table you have shown (Cinebench R11.5):

And multi-threaded:

The single-thread results are not very different from the ones shown on your page. It is important to note that the two logical processors have separate execution states but share resources such as the system bus and cache, so they cannot always run tasks in parallel. Thread stalling, mentioned in this article, can also occur: in a single-thread stress test, resource sharing can cause some work to queue up, delivering a slightly worse result.

You can also see how this plays out across different games in the overclock.net article, where the results show that performance is hurt in some cases. I do not believe this should be read as "disabling HT improves single-thread performance" but rather as "the game is optimized for a maximum of 4 cores" or "the game is not taking advantage of HT". The first assumption can be validated by reading articles like this one, which shows that an i3 gains performance with HT enabled, whereas an i7 does not.

To sum up, there are a few cases where disabling Hyper-Threading gives a minimal improvement in single-thread performance, but the overall cost-benefit ratio is not enough to justify disabling it. As long as the OS and the software are designed for the HT architecture, it is not worth disabling.

Solution 2

Yes, and it should be obvious. When you enable HT, you advertise twice as many cores as there are.

This is designed to let more parallelization happen on the basis that most programs are not sufficiently multi-threaded. However, if you fully multi-thread a program, then you overcommit resources and there is a performance drop just because of the extra overhead per thread. However small this may be, with an application that managed to use 100% CPU over any number of cores and processors, enabling HT resulted in a roughly 2-3% drop in performance.

Now, in the case of an isolated single-threaded program, it sounds like it should not matter, since the program itself cannot overuse resources. But remember that the OS also thinks there are extra cores, and that can overcommit resources. Even if there are still unused cores, one can measure overhead caused by the scheduler when it does not place the thread optimally and lock it to a single real core.
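
One way to take scheduler placement out of the picture when measuring this is to pin the test process to a single logical CPU. The following is only a sketch under stated assumptions: it relies on `os.sched_setaffinity`, which Python provides on Linux only, and `busy_work` and `time_pinned` are hypothetical names for a stand-in workload and a timing helper, not the benchmark used for the observations above.

```python
import os
import time


def busy_work(n: int) -> int:
    """Stand-in for a single-threaded, CPU-bound workload."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def time_pinned(cpu: int, n: int = 200_000) -> float:
    """Time busy_work while pinned to one logical CPU; returns elapsed seconds.

    Pinning uses os.sched_setaffinity (Linux only). On platforms without it,
    the workload simply runs unpinned.
    """
    can_pin = hasattr(os, "sched_setaffinity")
    previous = os.sched_getaffinity(0) if can_pin else None
    if can_pin:
        os.sched_setaffinity(0, {cpu})  # restrict this process to one logical CPU
    try:
        start = time.perf_counter()
        busy_work(n)
        return time.perf_counter() - start
    finally:
        if can_pin:
            os.sched_setaffinity(0, previous)  # restore the original CPU mask
```

Running `time_pinned` many times with HT enabled and again with HT disabled, and comparing the distributions rather than single runs, would be one way to check whether the small difference described here shows up on a given machine.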

These observations are based on over a decade of real-time software development and benchmarks. There is clearly an observable difference, although a very small one, when one tries to maximize the performance of a system.

Bonita Montero

Updated on September 18, 2022

Comments

Bonita Montero over 1 year

I just read an article on Heise Online (look at the table, the rest is German) which claimed that Hyper-Threading slows down single-threaded programs even though they don't use the second thread of a core. I.e. if you disable HT in the BIOS, the single-threaded app runs slightly faster.

Is this true, or is this a measurement error? Does anyone have sources for benchmarks that show the same?
jgorostegui over 7 years

Updated answer.
metacollin about 4 years

You managed to get almost every important point about HT completely wrong. First, adding more cores, virtual or otherwise, doesn't magically make programs that are not very multithreaded become more so, any more than adding more physical cores does. Secondly, multithreading overhead is entirely dependent on the program and implementation, and regardless, is totally irrelevant to HT. You seem to have a very deep misunderstanding of how HT works.
metacollin about 4 years

HT doesn't result in 'over committing' resources. HT is giving each execution core two architectural state units (pipeline, decoder, interrupts, registers, everything an entire extra physical core would have, except for the execution core itself). Since any HT'd CPUs are superscalar, this means they can reorder instructions, and any time there is data specific to each logical core that needs the same operation performed, that instruction can be executed on both threads' data simultaneously.
metacollin about 4 years

This obviously will have anywhere from a negligible to substantial improvement in performance depending on how often two threads both need to perform the same operation on different data. But it's important to note that any modern OS and CPU isn't 'over provisioning' anything. Each thread gets its own core, and it is only when the number of threads exceeds the physical cores that logical cores come into play. At this point, there are no resources left anyway, but at least you can combine certain operations to use each execution core to do that much more per instruction.
metacollin about 4 years

But, arguably the main benefit from HT is that it results in a profound reduction in pipeline stalls (when the execution core of a CPU is sitting idle because it is waiting on data that wasn't in the L1 cache to get loaded from L2/L3/or worst case, even system RAM). Without HT, all that time is simply wasted. With HT, the moment one pipeline stalls, guess what - there is that entire second pipeline as well. So if either thread ends up waiting on data, then if the other thread isn't, the core can execute those instructions. Every instruction completed is one more than no HT could have done.
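
The stall-hiding argument in the comment above can be illustrated with a deliberately crude toy model. Everything here is an assumption made for illustration: each thread is a string where `X` is a cycle that needs the execution unit and `.` is a cycle spent waiting on memory, the core has exactly one execution unit, and `cycles_sequential` and `cycles_smt` are made-up names. Real cores are far more complex, so this only shows the shape of the effect, not its size.

```python
def cycles_sequential(traces: list[str]) -> int:
    """Run the threads back to back on a core without HT: every slot costs a cycle."""
    return sum(len(trace) for trace in traces)


def cycles_smt(traces: list[str]) -> int:
    """Run the threads together on one execution unit.

    A stalled thread ('.') waits in the background, so another thread
    can use the execution unit during that cycle.
    """
    pos = [0] * len(traces)
    cycles = 0
    while any(p < len(t) for p, t in zip(pos, traces)):
        unit_busy = False
        # Rotate priority every cycle so that no thread is starved.
        order = [(cycles + k) % len(traces) for k in range(len(traces))]
        for i in order:
            if pos[i] >= len(traces[i]):
                continue  # this thread has finished
            if traces[i][pos[i]] == ".":
                pos[i] += 1  # the stall elapses regardless of the other thread
            elif not unit_busy:
                pos[i] += 1  # this thread gets the execution unit this cycle
                unit_busy = True
            # otherwise the thread is ready but has to wait for the unit
        cycles += 1
    return cycles
```

In this model, two stall-heavy threads such as `"X..X.."` take 12 cycles back to back but 7 cycles when interleaved, while two threads that never stall (`"XXXX"`) take 8 cycles either way, which matches the point that the gain depends on how often a thread is waiting on data.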
metacollin about 4 years

Finally, for the same reason, HT helps mitigate some of the performance lost from branch prediction misses. CPUs use speculative branch prediction, meaning they prefetch stuff on the assumption that a certain if statement or whatever will go a certain way. If it guesses wrong, you have to wait for the pipeline to fill up again, causing several cycles of wasted execution time. With HT, whenever branch prediction guesses wrong, it has that other pipeline to execute instructions from while the first one refills.
metacollin about 4 years

Also, your observations are, ultimately, little more than anecdotes and it doesn't matter how many years of them you have. A mountain of real world benchmarks, measurements, and data comparing HT on vs off, done using an incredible variety of tasks, is one google search away. And it overwhelmingly contradicts your anecdotes. And no offense, but using anecdotes when others have actual data... you're bringing a knife to a gunfight. Your observations don't even mention what OS, the workload, the number of cores. Because what you describe depends on all that, it isn't generalized to HT.
metacollin about 4 years

Beyond all that, real-time software development is about minimizing latency and maintaining responsiveness. That hardly makes you the expert on HPC (high performance computing)/number crunching. I'm sure if you limit yourself to a strict subset of tasks that depend on real time performance, HT loses any advantage. Indeed, real time applications recommend turning HT off. But those are by far the minority of tasks computers are asked to do.
metacollin about 4 years

And here is the proof. Check the third and fourth page too. These are common tasks, and of all the tasks tested, only one was actually slower with HT turned on, and the difference was about 2.5%. Every single other task tested saw modest to significant (30%+) improvement from HT. You have to be doing very specific types of tasks, and only those tasks, before it ever makes sense to turn HT off. phoronix.com/…