Lies, damned lies and benchmarks. So goes an old industry joke, setting up an ascending order of offenses to the truth. An old joke, but alive and well in the latest industry trend: the recourse to multicore processors in our PCs.
Here, multicore means several processor modules (cores) on the same CPU (Central Processing Unit) chip, as opposed to multiprocessors, meaning several separate chips inside the same computer. More computing power inside our computers; this must be good.
Not so fast. Yes, more raw power but do we know how much extra performance percolates to the surface of our user experience? Not as much as we’re led to believe.
Why this sudden conversion to multicores? The simple answer: Moore’s Law stopped working the way it did for almost 40 years. It used to predict a doubling every 18 months in the price/performance ratio of silicon chips. As expected, in about twenty years we went from 1 MHz (the frequency at which the CPU processes instructions) for the Apple II to 3 GHz (3,000 times faster) Intel chips, for about the same price. But in the last few years something happened: the clock frequency of top-of-the-line chips got stuck around 3 GHz. This didn’t happen because silicon technology stopped improving; we now speak of silicon building blocks as small as 35 nanometers (billionths of a meter), or even smaller in pre-production labs. A few years ago, we were happy with 120 nm or larger. So the surface of things looks good: we still know how to cram more and more logic elements onto a chip. But we have trouble making them run faster. Why?
Here basic physics comes in. Let’s say I want to move a one-gram mass up and down once; this requires a small amount of energy, say one joule. If I repeat this once per second, we have one joule per second, known as one watt. Moving to 1,000 times a second, we’re now dealing with a kilowatt. If the frequency climbs to 1 GHz, one billion times per second, we need a gigawatt. Back to chips: they move electrons back and forth as the processor clock ticks. You see where I’m going: the electric power consumed by a chip climbs with the clock frequency. At the same time, the basic silicon elements kept shrinking. More and more electric power in smaller and smaller devices. One Intel scientist, only half joking, warned that processors could become as hot as the inside of a nuclear reactor.
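For the technically inclined, the back-of-the-envelope arithmetic above fits in a few lines of Python. The numbers are purely illustrative (one joule per event, as in the analogy), not real chip measurements:

```python
# Illustrative sketch of the power argument: energy per event times
# events per second gives watts. Assumed numbers, not chip data.
energy_per_event_joules = 1.0  # one joule per up-and-down move

for frequency_hz in (1, 1_000, 1_000_000_000):
    power_watts = energy_per_event_joules * frequency_hz
    print(f"{frequency_hz:>13,} events/s -> {power_watts:,.0f} W")
```

Running it shows the progression from one watt to a kilowatt to a gigawatt as the frequency climbs, which is the whole point: power scales with clock speed.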
Back to our machines: we have desktop processors that dissipate as much as 150 watts and require a liquid cooling element right on top of the chip. And we all complain our laptops are too hot for our… laps.
But now, imagine the computer industry calmly folding its arms and telling us: That’s all folks, this is as good as it gets. This after decades of more/faster/cheaper? No. That’s why our Valley is now peddling multicores. We can’t have faster processors (this is mostly left unsaid), let’s have more of them. And look at the benchmarks, more power than ever. This is where the question of performance delivered to the user versus raw power comes in.
First, 1 + 1 doesn’t equal 2, simply because the two processors sometimes have to contend for a single resource, such as memory. One processor must wait for the other to finish before proceeding. More cores, more such losses.
Second, and much more serious, most of today’s software has been written with a single processor in mind. There is no easy mechanism, whether in the processors themselves, the operating system, or the program itself, to split code modules off and direct them to one processor or another. The situation is getting better as operating systems learn, at least, to dispatch ancillary housekeeping functions to another core, leaving more computing power available to a program that only knows how to work on a single processor. And programs themselves are slowly but surely being updated to split off modules that work independently. Sometimes this requires much programmer intervention, read: time and money. In other cases, automated tools restructure some or most of the code. Still, today’s PC software is far from taking full advantage of multicores. Hence the reference to benchmarks painting an unrealistic picture of multicore performance in the real world of application software.
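To see what “splitting off modules that work independently” looks like in practice, here is a toy sketch using Python’s standard multiprocessing library. The work unit (`square`, a made-up example) shares no state between calls, which is precisely what makes it safe to farm out to separate cores:

```python
from multiprocessing import Pool

def square(n: int) -> int:
    # An independent unit of work: no shared state with other calls,
    # so any core can run any call in any order.
    return n * n

if __name__ == "__main__":
    # Two worker processes, standing in for two cores.
    with Pool(processes=2) as pool:
        results = pool.map(square, range(8))
    print(results)  # [0, 1, 4, 9, 16, 25, 36, 49]
```

Real applications are rarely this tidy; finding and isolating such independent units in a large single-threaded program is exactly the expensive programmer intervention mentioned above.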
And, third, there is yet another fly in the benchmark ointment. Some activities are inherently parallelizable: ten people will (statistically) find a single book on ten library shelves faster than a single person. Four people will definitely paint four walls faster than a lone painter (assuming no contention for a single paint bucket, see above). But other activities are inherently sequential: you must wait for the result of the previous operation before proceeding with the next. Think of spreadsheets, where a complex, real-world financial model cannot be computed in independent parts; each operation feeds the next until all the formulae have been computed and, in some cases, iterated. There are many such applications, weather simulation being one, because it relies on a type of equation that cannot be made to compute in parallel. As you can imagine, there is a whole body of computer science dedicated to parallelism. Let’s just say there is no real substitute for gigahertz, for faster chips. That’s one of the reasons why weather forecasting hasn’t made much progress recently.
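The parallel/sequential distinction can be made concrete in a few lines. In the first half below, each element can be handled independently (the library-shelf search); in the second, each step needs the previous result, like a chained spreadsheet formula — the numbers are invented for illustration:

```python
# Parallelizable: elements are independent, so the work
# could be split across cores in any order.
data = [3, 1, 4, 1, 5, 9, 2, 6]
doubled = [x * 2 for x in data]

# Inherently sequential: month n depends on month n-1,
# like a spreadsheet formula chain or an iterated model.
balance = 100.0
for _ in range(12):
    balance = balance * 1.01  # cannot compute month 12 before month 11
print(round(balance, 2))  # compound interest: 112.68
```

No number of cores helps with the second loop; only a faster clock shortens the chain. That, in miniature, is the weather-simulation problem.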
Multicores are nice, and they do add some performance, but they’re only a band-aid until we find a way to make faster chips. — JLG