valgrind --tool=cachegrind --branch-sim=yes {PATH_TO_PROGRAM}

This simulates how your program interacts with the CPU caches and, with --branch-sim=yes, how well its branches are predicted. Cache misses and branch mispredictions both have a large impact on performance.

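Cachegrind reports data about:

  • I1 Cache (first-level instruction cache)
  • D1 Cache (first-level data cache)
  • LL Cache (last-level cache, typically L3)

For each level it reports how many misses occurred. Comparing these counts before and after a change shows what an optimization actually did, so we can judge whether its trade-offs were worth it.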
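
One way to make the before/after comparison concrete is to write each run to its own output file and then compare miss rates. The following is only a sketch under stated assumptions, not part of the original workflow: profile_cache and miss_rate are hypothetical helper names, and the labels used with them (for example baseline and optimized) are placeholders. It relies on Cachegrind's --cachegrind-out-file option and on awk.

```shell
# Hypothetical helper: run a program under Cachegrind and write the results
# to cachegrind.out.<label>, so runs of different builds do not get mixed up.
# Usage: profile_cache <label> <program> [args...]
profile_cache() {
    label="$1"
    shift
    valgrind --tool=cachegrind --branch-sim=yes \
        --cachegrind-out-file="cachegrind.out.$label" "$@"
}

# Hypothetical helper: miss rate in percent, given a miss count and a
# reference count copied by hand from the Cachegrind summary.
# Usage: miss_rate <misses> <refs>
miss_rate() {
    awk -v misses="$1" -v refs="$2" \
        'BEGIN { if (refs > 0) printf "%.2f\n", 100 * misses / refs; else print "n/a" }'
}
```

For example, profile_cache baseline {PATH_TO_PROGRAM} followed by cg_annotate cachegrind.out.baseline prints the per-function breakdown for that run, and miss_rate 25 1000 prints 2.50. Some newer Valgrind releases may leave the cache simulation off by default; if the I1, D1 and LL lines are missing from the summary, adding --cache-sim=yes to the valgrind command is worth trying.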