Basic performance metrics

Published

2025-09-14

Basic performance metrics: CPU time, CPI, MIPS

When evaluating computer systems, it is tempting to focus only on the processor’s clock speed, which is typically reported in gigahertz (GHz), though some older processors and other devices operate in the megahertz (MHz) range.

The clock frequency tells us how many cycles the processor completes per second. For example, the clock of a 3 GHz processor “ticks” three billion times per second, so its clock period is about 0.333 nanoseconds. However, raw clock speed is not a complete measure of performance. Two processors running at the same frequency can take different amounts of time to complete the same task, and a processor with a lower frequency can even finish sooner than one with a higher frequency, depending on their design and efficiency.

To understand performance, computer architects use a fundamental equation that breaks down CPU execution time into three factors:

\text{CPU time} = \text{instruction count} \times \text{CPI} \times \text{clock cycle time}

The instruction count (IC) is the number of machine instructions executed for a given program. This depends on the instruction set architecture and also on the compiler, since different compilers may generate more or fewer instructions for the same source program.
The cycles per instruction (CPI) is the average number of clock cycles each instruction requires to complete. CPI depends on the microarchitecture of the processor. Simple instructions may take one cycle, while more complex instructions can take multiple cycles.
The clock cycle time is the duration of a single cycle. It is the reciprocal of the clock frequency. A 2 GHz processor has a cycle time of 0.500 nanoseconds, while a 3 GHz processor has a cycle time of about 0.333 nanoseconds.
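The equation and its three factors translate directly into code. The following is a minimal sketch; the function names `cycle_time_ns` and `cpu_time_s` are illustrative choices, and the values in the last call are hypothetical.

```python
def cycle_time_ns(clock_hz: float) -> float:
    """Clock cycle time in nanoseconds: the reciprocal of the clock frequency."""
    return 1e9 / clock_hz


def cpu_time_s(instruction_count: float, cpi: float, clock_hz: float) -> float:
    """CPU time in seconds = instruction count x CPI x clock cycle time."""
    # Multiplying by the cycle time (1 / clock_hz) is the same as dividing by clock_hz.
    return instruction_count * cpi / clock_hz


print(f"{cycle_time_ns(2e9):.3f} ns")  # cycle time of a 2 GHz processor
print(f"{cycle_time_ns(3e9):.3f} ns")  # cycle time of a 3 GHz processor

# Hypothetical workload: one million instructions, CPI of 1.0, 1 GHz clock.
print(f"{cpu_time_s(1e6, 1.0, 1e9):.6f} s")
```

Passing the clock frequency in hertz, rather than a rounded cycle time, avoids carrying a rounding error such as 0.333 ns through the multiplication.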

For example, consider two processors:

Processor A: 3 GHz clock, average CPI = 2.0
Processor B: 2 GHz clock, average CPI = 1.2

For each instruction, processor A requires about 0.667 ns on average (2.0 cycles at 1/3 ns per cycle), while processor B requires 0.6 ns (1.2 cycles at 0.5 ns per cycle). Even though processor B has a lower clock frequency, it executes each instruction faster on average because of its lower CPI. This example shows that clock speed alone does not determine performance. Let’s complete the calculation for 1,000,000,000 (one billion) instructions.

\begin{align*} \text{CPU time, processor A} &= 1{,}000{,}000{,}000 \times 2.0 \times (1 / 3 \text{ GHz}) \\ &= 1{,}000{,}000{,}000 \times 2.0 \times \tfrac{1}{3} \text{ ns} \\ &\approx 667{,}000{,}000 \text{ ns} = 0.667 \text{ s} \end{align*}

\begin{align*} \text{CPU time, processor B} &= 1{,}000{,}000{,}000 \times 1.2 \times (1 / 2 \text{ GHz}) \\ &= 1{,}000{,}000{,}000 \times 1.2 \times 0.500 \text{ ns} \\ &= 600{,}000{,}000 \text{ ns} = 0.600 \text{ s} \end{align*}

So in this example, processor A takes about 0.667 s to execute 1,000,000,000 instructions, and processor B, running at a slower clock speed, takes less: 0.600 s.
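The same comparison can be checked with a short script. This is a sketch using the numbers from the example; the dictionary layout and the name `cpu_time_s` are illustrative. Because it does not round the cycle time before multiplying, processor A's time prints as 0.667 s.

```python
INSTRUCTIONS = 1_000_000_000  # one billion instructions, as in the example

# Clock frequency (Hz) and average CPI for the two example processors.
processors = {
    "A": {"clock_hz": 3e9, "cpi": 2.0},
    "B": {"clock_hz": 2e9, "cpi": 1.2},
}


def cpu_time_s(instruction_count, cpi, clock_hz):
    """CPU time in seconds = instruction count x CPI x clock cycle time."""
    return instruction_count * cpi / clock_hz


times = {
    name: cpu_time_s(INSTRUCTIONS, p["cpi"], p["clock_hz"])
    for name, p in processors.items()
}

for name, seconds in times.items():
    print(f"Processor {name}: {seconds:.3f} s")

# Ratio of execution times: how many times faster B finishes than A.
print(f"B is {times['A'] / times['B']:.2f}x as fast as A")
```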

In practice, overall execution time also depends on the instruction count. A compiler that produces more efficient code may reduce the number of instructions, which can compensate for a higher CPI. Conversely, an instruction set with complex operations might reduce instruction count but increase CPI.
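To make this trade-off concrete, the sketch below compares two hypothetical builds of the same program on one 2 GHz processor. All of the numbers are invented for illustration: build 2 executes 20% fewer instructions but has a higher average CPI.

```python
def cpu_time_s(instruction_count, cpi, clock_hz):
    """CPU time in seconds = instruction count x CPI x clock cycle time."""
    return instruction_count * cpi / clock_hz


CLOCK_HZ = 2e9  # hypothetical 2 GHz processor

# Hypothetical builds of the same program (illustrative numbers only).
builds = {
    "build 1": {"instructions": 1.0e9, "cpi": 1.2},
    "build 2": {"instructions": 0.8e9, "cpi": 1.4},
}

build_times = {
    name: cpu_time_s(b["instructions"], b["cpi"], CLOCK_HZ)
    for name, b in builds.items()
}

for name, seconds in build_times.items():
    print(f"{name}: {seconds:.3f} s")
```

With these assumed values, build 2 finishes sooner despite its higher CPI, because the reduction in instruction count outweighs the extra cycles per instruction.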

Another useful metric is MIPS (millions of instructions per second), which measures the rate at which a processor executes instructions. (Don’t confuse this with the MIPS architecture, whose name stands for “Microprocessor without Interlocked Pipelined Stages” and is also a play on this metric.) With the clock speed expressed in hertz, MIPS is:

\text{MIPS} = \frac{\text{clock speed}}{\text{CPI } \times 10^6}

Following the example above:

\text{MIPS, processor A} = \frac{3 \text{ GHz}}{2.0 \times 10^6} = 1{,}500 \text{ MIPS}

\text{MIPS, processor B} = \frac{2 \text{ GHz}}{1.2 \times 10^6} \approx 1{,}667 \text{ MIPS}
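The MIPS formula is simple enough to verify in a few lines. This is a minimal sketch; the function name `mips` is an illustrative choice, and the clock speed is passed in hertz.

```python
def mips(clock_hz: float, cpi: float) -> float:
    """MIPS = clock speed (Hz) / (CPI x 10^6)."""
    return clock_hz / (cpi * 1e6)


mips_a = mips(3e9, 2.0)  # processor A: 3 GHz, CPI 2.0
mips_b = mips(2e9, 1.2)  # processor B: 2 GHz, CPI 1.2

print(f"Processor A: {mips_a:.0f} MIPS")
print(f"Processor B: {mips_b:.0f} MIPS")
```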

However, MIPS can be misleading because instruction complexity varies across architectures. A program that executes fewer but more complex instructions may appear to have lower MIPS but still run faster.
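The sketch below illustrates how this can happen. Both machines and all of their numbers are hypothetical: machine Y needs fewer, more complex instructions to run the same program, so its MIPS rating is lower even though it finishes first.

```python
CLOCK_HZ = 2e9  # both hypothetical machines run at 2 GHz

# Instruction count and average CPI for the same program (illustrative numbers only).
machines = {
    "X": {"instructions": 1.0e9, "cpi": 1.0},  # many simple instructions
    "Y": {"instructions": 0.4e9, "cpi": 2.0},  # fewer, more complex instructions
}


def mips(clock_hz, cpi):
    """MIPS = clock speed (Hz) / (CPI x 10^6)."""
    return clock_hz / (cpi * 1e6)


def cpu_time_s(instruction_count, cpi, clock_hz):
    """CPU time in seconds = instruction count x CPI x clock cycle time."""
    return instruction_count * cpi / clock_hz


results = {}
for name, m in machines.items():
    rating = mips(CLOCK_HZ, m["cpi"])
    seconds = cpu_time_s(m["instructions"], m["cpi"], CLOCK_HZ)
    results[name] = (rating, seconds)
    print(f"Machine {name}: {rating:.0f} MIPS, {seconds:.3f} s")
```

The MIPS formula contains no instruction count term, which is why it cannot capture the difference between these two machines.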

Modern benchmarks often measure performance using real-world applications or standardized test suites, because raw metrics like GHz or MIPS can be misleading on their own. Nonetheless, the CPU performance equation and MIPS provide a valuable framework for understanding the trade-offs between instruction count, CPI, and clock speed.

Adapted from "Patterson and Hennessy, Computer Organization, ARM edition" by Surya Malik and Clayton Cafiero.

No generative AI was used in writing this material. This was written the old-fashioned way.