The typical, boring, standard uniprocessor is a Single Instruction, Single Data (SISD) processor. The M in acronyms such as SIMD (Single Instruction, Multiple Data), which describes what modern processors offer, stands for "multiple".

Overview:

  • You can load a batch of data and perform arithmetic on all of it at once.
  • Instructions process multiple data items simultaneously.
  • SIMD provides an advantage by using a single control unit to command multiple processing units, which reduces the amount of overhead in the instruction stream.

If we want to parallelize a for loop, we can use SIMD instead of threads. Compiling with the rustc defaults and looking at the output shows what the core loop body turns into.

In Rust, by default, the compiler assumes a baseline target architecture. Targeting an instruction set that is too new will cause the code to fail on older devices. A packed operation operates on multiple data elements at once. The implication is that a loop over the data needs fewer iterations, because each iteration handles several elements.
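One way to use newer instructions without breaking older devices is to check for them at run time and fall back to a plain loop. The following is a minimal sketch, not a definitive implementation. The function names `add_scalar`, `add_avx`, and `add_slices` are made up for illustration, and the choice of AVX on x86_64 is an assumption. On any other architecture the sketch simply uses the scalar loop.

```rust
/// Scalar fallback: one addition per loop iteration.
fn add_scalar(a: &[f32], b: &[f32], out: &mut [f32]) {
    for i in 0..out.len() {
        out[i] = a[i] + b[i];
    }
}

/// Packed version (hypothetical helper): eight f32 additions per iteration.
/// The caller must ensure the CPU supports AVX and all slices have equal length.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx")]
unsafe fn add_avx(a: &[f32], b: &[f32], out: &mut [f32]) {
    use std::arch::x86_64::{_mm256_add_ps, _mm256_loadu_ps, _mm256_storeu_ps};

    let n = out.len();
    let mut i = 0;
    while i + 8 <= n {
        // SAFETY: i + 8 <= n, so each unaligned load/store stays in bounds.
        unsafe {
            let va = _mm256_loadu_ps(a.as_ptr().add(i));
            let vb = _mm256_loadu_ps(b.as_ptr().add(i));
            _mm256_storeu_ps(out.as_mut_ptr().add(i), _mm256_add_ps(va, vb));
        }
        i += 8;
    }
    // Handle the leftover elements (fewer than eight) one at a time.
    for j in i..n {
        out[j] = a[j] + b[j];
    }
}

/// Adds `a` and `b` element-wise into `out`, using AVX only if the CPU has it.
fn add_slices(a: &[f32], b: &[f32], out: &mut [f32]) {
    assert!(a.len() == out.len() && b.len() == out.len());

    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx") {
            // SAFETY: AVX support was just checked and the lengths match.
            unsafe { add_avx(a, b, out) };
            return;
        }
    }

    add_scalar(a, b, out);
}
```

Calling `add_slices(&a, &b, &mut out)` on three slices of equal length gives the same result on every machine. Only the number of loop iterations changes: roughly one eighth as many on the AVX path.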

SIMD is a great fit for loops that operate over vectors of data.
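As a rough illustration of such a loop, the sketch below sums a slice four elements at a time, keeping four independent partial sums that play the role of SIMD lanes. The function name `sum_in_lanes` and the lane count of four are assumptions made for the example. Whether the compiler actually emits packed instructions for it depends on the optimization level and the target, so treat it as a shape that is friendly to vectorization rather than a guarantee.

```rust
/// Sums a slice four elements per iteration using four partial sums ("lanes").
fn sum_in_lanes(data: &[i64]) -> i64 {
    let mut lanes = [0i64; 4];

    // Split the data into full groups of four plus a short tail.
    let chunks = data.chunks_exact(4);
    let tail = chunks.remainder();

    for chunk in chunks {
        // Each lane accumulates independently, like one packed add.
        for i in 0..4 {
            lanes[i] += chunk[i];
        }
    }

    // Combine the lanes, then add the elements that did not fill a chunk.
    lanes.iter().sum::<i64>() + tail.iter().sum::<i64>()
}
```

For ten elements the main loop runs twice instead of ten times, and the last two elements are handled by the tail sum.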