Compilers can schedule, or reorder, instructions so that pipeline stalls occur less frequently.
Before closing out our discussion of pipeline stalls, I should introduce another term that you'll be seeing periodically throughout the rest of this article: instruction latency.
As you can see, two gotos (and thus, two pipeline stalls) have been eliminated in the execution.
A central scheduler dynamically dispatches threads to pipeline resources to maximize rendering throughput (and decrease the impact of individual pipeline stalls).
One barrier to achieving higher performance through instruction-level parallelism stems from pipeline stalls and flushes due to branches.
With IBS support, CodeAnalyst can more precisely identify instructions that cause pipeline stalls and cache misses.
The deep buffers that make up the Pentium 4's instruction window (described in detail here) are aimed at eliminating pipeline stalls.
Quite a large percentage of the architectural features of the processors that I've covered over the years have been dedicated to preventing pipeline stalls.
There is also a new instruction to convert floating-point values to integers without having to change the global rounding mode, thus avoiding costly pipeline stalls.
Avoid pipeline stalls by rearranging the order of instructions.
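
Several of the sentences above refer to reordering instructions so that stalls occur less often. The sketch below illustrates the idea with a toy model, not any real processor: it assumes a single-issue, in-order pipeline in which an instruction waits until its source registers are ready. The `Instr` type, the `count_stalls` function, the register names, and the latencies are all hypothetical and chosen only for illustration.

```python
from typing import NamedTuple, Sequence, Tuple


class Instr(NamedTuple):
    name: str
    dest: str               # register written by this instruction
    srcs: Tuple[str, ...]   # registers read by this instruction
    latency: int            # cycles until the result is available (assumed)


def count_stalls(program: Sequence[Instr]) -> int:
    """Count stall cycles in a toy in-order, single-issue pipeline."""
    ready_at = {}  # register -> cycle at which its value becomes available
    cycle = 0
    stalls = 0
    for ins in program:
        # An instruction cannot issue before all of its sources are ready.
        earliest = max((ready_at.get(r, 0) for r in ins.srcs), default=0)
        if earliest > cycle:
            stalls += earliest - cycle
            cycle = earliest
        ready_at[ins.dest] = cycle + ins.latency
        cycle += 1  # one instruction issues per cycle
    return stalls


load_a = Instr("load r1", "r1", (), 3)
add_a = Instr("add r2, r1, r0", "r2", ("r1", "r0"), 1)
load_b = Instr("load r3", "r3", (), 3)
add_b = Instr("add r4, r3, r0", "r4", ("r3", "r0"), 1)

# Each add directly follows the load it depends on, so it waits for it.
naive = [load_a, add_a, load_b, add_b]
# Issuing both loads first lets their latencies overlap.
scheduled = [load_a, load_b, add_a, add_b]

print(count_stalls(naive), count_stalls(scheduled))
```

In this model the reordered sequence computes the same results with fewer stall cycles, because the second load issues while the first is still in flight. Real compilers and out-of-order hardware work with far more detailed machine models than this one.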