The compiler takes a look at code that results in branch instructions like loops, etc, and it assesses what it thinks will happen. Since we have pipelining, we fetch the next instruction while decoding the previous one, if we need to branch, we might throw away the contents of the pipeline which is expensive.

Methods

The most basic thing we can do is predict that a branch is always taken. If we have for example, 70% accuracy, if a mis-predicted branch costs 20 cycles, we can almost double our performance (see lecture notes).

For loop branches, we can observe that a loop that is taken goes backwards, and a loop that is not taken goes forwards. The compiler can then use this information to encode what it thinks about forward branches (which makes the not-taken branch the one that is more likely).

Instead of these static predictions, we can also make dynamic predictions. For every branch, we can record whether it was taken or not the last time it executed and use this to predict (using a 1 bit scheme).

Using a 2 bit scheme, every time we see a taken branch, we increment the counter for that branch, and we decrement every time we see a not taken branch. Based on our count (using 2 bits 00), we make our prediction.

If we maintain a branch history that points to an index in our 2-bit lookup table, we can get even more accuracy.

Summary

Branch prediction enables pipelining and increased performance. Most branches are predictable now.