Did you know every one of your rack-mount servers has a literal turbo button, and you probably haven't pressed it?
Back in the 1980s, the hugely popular Intel 8086 and 8088 processors had a standard clock frequency of 4.77 MHz. This stayed consistent long enough that many games and even some device drivers were written to rely on that specific timing. In principle, the developers should have used the real time clock (RTC) like modern games do, but reading the clock back then was expensive, so a common “optimisation” was to rely on the CPU frequency alone. Progress inevitably marched on and processor frequencies increased. Since game speeds were tied to the frequency, they sped up like a video tape on fast forward, and most such games became essentially unplayable. For the sake of backwards compatibility, PC manufacturers added a “Turbo” button on the front of the case that could be used to switch between the legacy 4.77 MHz frequency and whatever speed the upgraded processors were actually capable of.
I have fond memories from my childhood of visiting friends and pressing the turbo button on their PCs. The shock on their faces when the computer they'd had for years magically sped up by a factor of two or more was just priceless. There were a few holdouts, however. I vividly remember one friend getting very upset that I had done something “bad” to his PC, insisting that I undo “whatever I had done”. I quickly switched the turbo off because it looked like he was about to cry any second.
It's 2020 now – the actual future – and nothing has changed. There's still a turbo button on all Intel-based PCs and servers, except instead of a physical button, it's all software controlled. Unfortunately, it's off by default, so I still get to run around gleefully pressing it. I still see many people hesitate to approve the change, despite clear-cut benefits that are simply too good to ignore.
Largely, this is the “fault” of green computing initiatives such as the United States federal ENERGY STAR program, as well as various local programs in Europe and California. Because so many jurisdictions require some form of energy-efficiency features, all manufacturers enable these features by default. The features go by various names, but for CPUs they are most commonly referred to as ACPI P-states and C-states, and generically as “Power Management” or something similar for all other devices.
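If you're curious whether the software turbo button on a given box has been pressed, the quickest thing to check is how the operating system is driving those P-states. Here's a minimal sketch, assuming a Linux host that exposes the cpufreq sysfs interface (on Windows the equivalent knob is the power plan, on VMware the host power policy):

```python
# Minimal sketch (assumes Linux with the cpufreq sysfs interface).
# Prints each core's frequency governor; most distros ship with
# "powersave" or "ondemand" rather than "performance" by default.
from pathlib import Path

for cpu in sorted(Path("/sys/devices/system/cpu").glob("cpu[0-9]*")):
    gov = cpu / "cpufreq" / "scaling_governor"
    if gov.exists():
        print(f"{cpu.name}: {gov.read_text().strip()}")

# "Pressing the turbo button" amounts to writing "performance" into each
# of those scaling_governor files (as root), plus disabling the deeper
# C-states in the BIOS or via your vendor's power profile.
```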
At first, this wasn't that big a deal. In the early 2000s Intel-based servers had only the SpeedStep feature, which reduced clock speeds but kept all CPU cores operational at all times. Idle servers would use less power, but quickly returned to full speed if they had work to do. Also, server networking back then was typically 100 Mbps Ethernet and all disks were mechanical, so the additional wakeup delays were lost in the noise. Turning off the green power saving mode just increased your electricity bill and didn't do much of anything for application performance.
Things have changed since then: server networking starts at 10 Gbps and SSDs with read latencies of a few tens of microseconds are commonplace. The wakeup time from a processor's deep sleep state is now an eternity compared to the speed of the rest of the data centre environment. In principle, most server CPUs can switch between the various power-saving states in as little as a microsecond, but the real problem is the newer ACPI C-states that allow processor cores to be entirely powered off. If work is assigned to such a core, the wakeup time can be much higher. These wakeup times from deep sleep states are not well advertised, but can be on the order of a hundred microseconds or more. No matter how fast the wakeup, C3 and deeper sleep states completely power off the core, which forces its caches to be flushed, then reinitialised and resynchronised with the other cores when it wakes – that inherently takes time. On top of that, most power-management implementations will not switch a core directly from deep sleep to its maximum clock speed, but instead “ramp up” incrementally through faster and faster clock speeds.
In practice, wakeup times of 50-200 μs are common, and I regularly see server processors “stuck” at clock speeds as low as 1 GHz all day.
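Those latencies aren't secret, just buried. On Linux, for instance, the kernel exposes the advertised exit latency of every C-state right next to the clock speed each core is actually running at. A quick sketch (Linux-specific sysfs paths; the numbers are whatever the firmware and kernel report, not independent measurements):

```python
# Sketch: show core 0's current clock speed and the advertised exit
# latency of every idle (C-) state. Assumes Linux with cpufreq/cpuidle
# sysfs support; values are as reported by the kernel.
from pathlib import Path

cpu0 = Path("/sys/devices/system/cpu/cpu0")

# Current clock speed, reported in kHz.
khz = int((cpu0 / "cpufreq" / "scaling_cur_freq").read_text())
print(f"cpu0 current clock: {khz / 1_000_000:.2f} GHz")

# Advertised exit latency, in microseconds, for each idle state.
for state in sorted((cpu0 / "cpuidle").glob("state[0-9]*")):
    name = (state / "name").read_text().strip()
    latency = int((state / "latency").read_text())
    print(f"{state.name}: {name:12s} exit latency ~{latency} μs")
```

The deepest states listed there are where the hundred-microsecond figures above come from.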
My experience in the field is that turning off green power saving features can occasionally double or even triple application performance, with a minimum expected boost of 20% in almost all cases. This is an email that was sent to an IT manager at a customer where disabling power management was the only change made:
@@@
The typical improvement is about 30% and tends to be higher on servers with lighter workloads. Modern, high core-count virtualised cluster hosts tend to be ideal candidates for seeing large improvements, whereas overloaded servers see little or no benefit – but in that case power management doesn't save any power either, so turning it off has no downside.
Look at it this way: Are you happy that you've purchased millions of dollars worth of server equipment, just to have nearly half of its capability thrown out the window for the want of a button press?
There are some “downsides” that aren't:
So what are the real downsides? There are only a few:
A common misconception is that CPU power management will hurt CPU-intensive workloads such as batch processing or reporting the most, but the exact opposite is true! A single expensive query or job will quickly spin all CPU cores up to their maximum speed, and they will stay there the whole time. The green power-saving features have a totally negligible effect, adding microseconds to processes that take seconds or minutes.
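You can watch this happen for yourself. The sketch below (again Linux-only, and the spin loop isn't pinned to a particular core, so treat the output as illustrative) starts some pointless arithmetic and samples the reported clock speed while it runs; on most systems the clock jumps to its maximum almost immediately:

```python
# Sketch: give the CPU some pointless work and watch the reported clock
# speed while it runs. Assumes Linux with cpufreq sysfs; the spin thread
# is not pinned to core 0, so the numbers are illustrative only.
import threading
import time
from pathlib import Path

FREQ = Path("/sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq")

def burn(stop: threading.Event) -> None:
    x = 0
    while not stop.is_set():
        x += 1  # busywork, nothing more

stop = threading.Event()
threading.Thread(target=burn, args=(stop,), daemon=True).start()

for _ in range(10):
    print(f"cpu0 at {int(FREQ.read_text()) / 1000:.0f} MHz")
    time.sleep(0.2)

stop.set()
```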
Server-to-server networking is a very different story. I discovered ACPI C-states and their dramatic effect on performance around the time that 10 Gbps Ethernet started seeing widespread adoption. It was often slower in practice than the 1 Gbps networking it replaced! This was completely counter-intuitive, and I saw many large enterprise customers roll out millions of dollars worth of kit only to see a reduction in throughput. Even simple file copies that used to speed along at “wire rate” on 1 Gbps would drop to as little as 600 Mbps on theoretically 10x faster equipment!
The “lightbulb moment” for me was when I discovered that a simple busy loop along the lines of a while(true) {} snippet in a script would increase network throughput. I used to run this demo for clients to show how forcing the CPU clock higher with busywork would improve things.
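For what it's worth, the demo really was that crude. Here's a rough recreation of it in Python (the process count and the interactive prompt are arbitrary choices): spin one process per core so that none of them is allowed to drop into a deep C-state, then re-run the file copy or network benchmark alongside it.

```python
# Rough recreation of the busywork demo: keep every core spinning so none
# of them can drop into a deep sleep state, then re-run the network test.
# The process count and the interactive prompt are arbitrary choices.
import multiprocessing as mp

def spin() -> None:
    while True:   # the infamous while(true) {}
        pass

if __name__ == "__main__":
    workers = [mp.Process(target=spin, daemon=True) for _ in range(mp.cpu_count())]
    for w in workers:
        w.start()
    try:
        input("Cores are busy - run your file copy now, then press Enter to stop.\n")
    finally:
        for w in workers:
            w.terminate()
```

Needless to say, burning a whole host's worth of CPU just to keep the clocks up is a party trick, not a fix – the real fix is turning the power management features off, which is the whole point of this article.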
@@@ 10 Gbps is worse than 1 Gbps! @@@ timeline of packets back and forth
@@@ table of Dell, HPE, etc… Static Max Power @@@ VMware ACPI-C state screenshot @@@ Windows power management for Hyper-V or bare metal
@@@ Microseconds matter @@@ Brent Ozar's article @@@ VMware latency sensitivity tuning