The main question here is figuring out if economies of scale are applicable.
Outline:
- Local generation (parallelizable)
- Whole-program Analysis (hard to parallelize)
- Local transformations (parallelizable)
Alias Analysis
Compiler optimizations are often concerned about the parts of memory each statement reads to which is easy for scalar variables stored on the stack. This is harder when talking about pointers and things that can alias. In Rust, “borrowing” makes this a bit easier.
Note
If two pointers don’t alias, we know their effects are independent and we can move them around.
Shape analysis builds on pointer analysis to determine that data structures are trees rather than lists so we can do branches in parallel, etc.
Call Graph Analysis
Many inter-procedural analyses require accurate Call Graphs. We need to know what calls what here.
Devirtualization
This is a specific version of call graph information being put to use. This particular optimization tries to convert a virtual function call to a direct call. In general, virtual function calls have the potential to be slow since there is a branch to predict. Here:
- We skip vtable access
- We skip trying to figure out what kind of thing something is
- Go straight to valid implementation
Inlining
This instructs the compiler to just insert function code where its called so there’s no function call overhead. You can tell the Rust compiler to inline a function using a hint #[inline] where you can also specify if it always inlines or never inlines. As you inline, the binary size of your program increases. By increasing inlining, you decrease cache hits and make more trips to memory somehow? Something to do with having to always fetch duplicate instructions?
This is also a suggestion, the compiler could ignore this.
This also makes debugging more hard since you can’t breakpoint a function that doesn’t exist. Also, if you change an inlined function in a library, the user needs to recompile their program.
Tail Recursion Elimination
A function is tail-recursive, if the recursive call is the very last operation performed by the function before returning its value. This means the recursive call is the last thing a function does.
If function A is called by function B, and the return value of A is immediately taken by B and returned by B, you could skip all of this and send data in a more efficient manner by reusing the current sack frame reducing overhead and space.