This paper proposes and evaluates software techniques that increase
register file utilization for simultaneous multithreading (SMT)
processors. SMT processors require large register files to hold multiple
thread contexts that can issue instructions, out of order, every cycle. By
supporting better inter-thread sharing and management of physical registers,
an SMT processor can reduce the number of registers required and can improve
performance for a given register file size.
Our techniques specifically target register deallocation. While out-of-order processors with register renaming are effective at knowing when a new physical register must be allocated, they are limited in knowing when physical registers can be deallocated. We propose architectural extensions that permit the compiler and operating system to (1) free registers immediately upon their last use, and (2) free registers allocated to idle thread contexts. Our results, based on detailed instruction-level simulations of an SMT processor, show that these techniques can increase performance significantly for register-intensive, multithreaded programs.
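As a rough illustration of technique (1), the toy Python model below compares conventional deallocation, where a physical register is freed only when its architectural register is redefined, with freeing on last use. It is a sketch under simplifying assumptions, not the simulator used in this work: the `Instr` class, the `dead_after` hint field, and `peak_registers` are hypothetical names, and the model ignores pipelining, speculation, and commit timing.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Instr:
    dest: Optional[str] = None        # architectural register written, if any
    srcs: Tuple[str, ...] = ()        # architectural registers read
    dead_after: Tuple[str, ...] = ()  # hypothetical hint: sources last used here


def peak_registers(program, use_hints):
    """Return the peak number of physical registers held at once."""
    mapping = {}    # architectural register -> physical register
    live = set()    # physical registers currently allocated
    next_phys = 0
    peak = 0
    for ins in program:
        if use_hints:
            # Software-directed: free each source right after its last use.
            for reg in ins.dead_after:
                live.discard(mapping.pop(reg))
        if ins.dest is not None:
            # Conventional: the old mapping is freed only on redefinition.
            old = mapping.get(ins.dest)
            if old is not None:
                live.discard(old)
            mapping[ins.dest] = next_phys
            live.add(next_phys)
            next_phys += 1
        peak = max(peak, len(live))
    return peak


# Illustrative program: two loads, an add, a multiply, and a store.
PROGRAM = [
    Instr(dest="r1"),
    Instr(dest="r2"),
    Instr(dest="r3", srcs=("r1", "r2"), dead_after=("r1", "r2")),
    Instr(dest="r4", srcs=("r3",), dead_after=("r3",)),
    Instr(srcs=("r4",), dead_after=("r4",)),
]

print(peak_registers(PROGRAM, use_hints=False))
print(peak_registers(PROGRAM, use_hints=True))
```

In this toy program no architectural register is ever redefined, so the conventional scheme holds every physical register until the end, while the hinted scheme releases each value as soon as it is dead. Technique (2), freeing the registers of idle thread contexts, is not modeled here.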