AfterBurner: Efficient Performance Scaling via Post-Retirement Processing
The AfterBurner project looks at improving single-thread performance on both simple and high-performance out-of-order cores in an energy-efficient way. Aside from explicit parallelism, this is the primary challenge of multi-core architectures going forward.

The most energy-efficient way to improve single-thread performance is to accelerate low-performing program regions. This approach yields the greatest benefit. It also has a low cost because it doesn’t require high-bandwidth execution, making it applicable to both simple and high-performance cores. Low single-thread performance is caused by squashes due to control and data mis-speculations and by long-latency loads and stores that clog the pipeline.

The AfterBurner project also looks at schemes to improve the energy efficiency of existing resources in out-of-order cores. For example, register reference-count bit-vectors are used to manage register-file state instead of conventional free-list FIFOs. These structures support register steering, which is a key enabler for coarse-grained register-file VDD-gating.
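
As a rough illustration of the idea above, the following Python sketch models a register file in which each physical register carries a reference bit-vector and is free exactly when that vector is zero. Everything here is a hypothetical simplification, not AfterBurner's actual design: the class name RefCountRegFile, the holder indices, the bank size, and the lowest-index steering policy are all assumptions chosen for illustration.

```python
class RefCountRegFile:
    """Toy model: physical registers tracked by reference bit-vectors.

    Hypothetical sketch. refs[r] is an integer used as a bit-vector;
    bit i set means "holder i" (e.g. an in-flight instruction or a
    map-table entry) still references physical register r.
    """

    def __init__(self, num_regs=16, bank_size=4):
        self.num_regs = num_regs
        self.bank_size = bank_size
        self.refs = [0] * num_regs  # all registers start free

    def is_free(self, reg):
        # A register is free when no holder references it.
        return self.refs[reg] == 0

    def allocate(self, holder):
        # Assumed steering policy: take the lowest-numbered free
        # register so live values pack into the low banks.
        for reg in range(self.num_regs):
            if self.refs[reg] == 0:
                self.refs[reg] = 1 << holder
                return reg
        return None  # no free register; a real core would stall

    def add_ref(self, reg, holder):
        # Another holder starts referencing the same register.
        self.refs[reg] |= 1 << holder

    def drop_ref(self, reg, holder):
        # A holder releases its reference; the register frees itself
        # once the whole vector reaches zero.
        self.refs[reg] &= ~(1 << holder)

    def gateable_banks(self):
        # Banks with no live registers are candidates for power gating.
        banks = []
        for bank in range(self.num_regs // self.bank_size):
            start = bank * self.bank_size
            if all(self.refs[r] == 0
                   for r in range(start, start + self.bank_size)):
                banks.append(bank)
        return banks
```

Because any free register can be found by inspecting the vectors, the allocator can choose where to place a new value, which a FIFO free list cannot do since it hands out whatever register sits at its head. In this sketch, packing live registers into low-numbered banks leaves the high-numbered banks empty, and an empty bank is a candidate for VDD-gating.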

AfterBurner’s multi-purpose infrastructure approach to performance and energy-efficiency reduces cost, simplifies design, and expands applicability to code that suffers from different low-performance events simultaneously. 

Collaborators: Amir Roth (UPenn/DoE), Andrew Hilton (Duke)
Funding: NSF CCF-1017654