Pipelining

Parallelism is achieved by starting to execute one instruction before the previous one is finished. The simplest kind overlaps the execution of one instruction with the fetch of the next instruction, as on a RISC. Because two instructions can be processed simultaneously, we say that the pipeline has two stages.
Load and store instructions reference memory, so they take two cycles.

A pipeline may have more than two stages. Suppose, for example, that an instruction consists of four phases:

1. Instruction fetch
2. Instruction decode
3. Operand fetch
4. Execute

In a non-pipelined processor, these must be executed sequentially, so that a result is available only every four pipeline cycles (subcycles). In a pipelined processor, after a delay to load the pipeline, a result is available every pipeline cycle.
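The cycle counts above can be checked with a small sketch. It assumes an idealized pipeline in which every stage takes exactly one pipeline cycle and nothing ever stalls; the function names and the stage abbreviations (IF, ID, OF, EX) are illustrative choices, not taken from any particular machine.

```python
def sequential_cycles(n_instructions: int, n_stages: int) -> int:
    """Cycles needed when each instruction runs all stages before the next starts."""
    return n_instructions * n_stages


def pipelined_cycles(n_instructions: int, n_stages: int) -> int:
    """Cycles needed in an ideal pipeline: fill it once, then one result per cycle."""
    if n_instructions == 0:
        return 0
    return n_stages + (n_instructions - 1)


def timing_diagram(n_instructions: int, stages=("IF", "ID", "OF", "EX")):
    """Return one text row per instruction; each column is one pipeline cycle."""
    rows = []
    for i in range(n_instructions):
        # Instruction i enters the pipeline i cycles after instruction 0.
        cells = ["  "] * i + list(stages)
        rows.append(" ".join(cells))
    return rows


print(sequential_cycles(10, 4))  # 40
print(pipelined_cycles(10, 4))   # 13
for row in timing_diagram(3):
    print(row)
```

For ten instructions and four stages, the sequential machine needs 40 cycles and the pipelined one 13: four to load the pipeline, then one more for each remaining instruction.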
The type of pipelining described above achieves instruction-level parallelism: the execution of multiple instructions in parallel. It is also possible to use pipelining to achieve data parallelism. A vector processor usually has a long pipeline and allows a large number of the same operations to take place concurrently (same operation, different data).

A single processor may possess multiple pipelines, allowing different operations to use different pipelines (e.g., there might be a specialized addition pipeline and a separate load pipeline). For example, the CDC 6600 had ten separate functional units, with a scoreboard to keep track of which was in use at any time.
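The idea of keeping track of which functional unit is in use can be illustrated with a very small busy table. This is a sketch under simplifying assumptions, not a model of the CDC 6600 scoreboard: a real scoreboard also has to track data dependencies between instructions, which is omitted here. The class name, unit names, and latencies are all hypothetical.

```python
class BusyTable:
    """Records, for each functional unit, the cycle at which it becomes free."""

    def __init__(self, units):
        # units maps a unit name to the kind of operation it performs,
        # e.g. {"mul0": "mul", "add0": "add"} (hypothetical names).
        self.kind = dict(units)
        self.free_at = {name: 0 for name in units}

    def issue(self, op_kind, cycle, latency):
        """Start an operation at `cycle` on a free unit of the right kind.

        Returns the name of the unit used, or None if every unit of that
        kind is still busy and the operation has to wait.
        """
        for name, kind in self.kind.items():
            if kind == op_kind and self.free_at[name] <= cycle:
                self.free_at[name] = cycle + latency
                return name
        return None


# Two multipliers and one adder; latencies are made up for illustration.
table = BusyTable({"mul0": "mul", "mul1": "mul", "add0": "add"})
print(table.issue("mul", cycle=0, latency=10))  # mul0
print(table.issue("mul", cycle=0, latency=10))  # mul1
print(table.issue("mul", cycle=1, latency=10))  # None
print(table.issue("add", cycle=1, latency=4))   # add0
```

With two multipliers, two multiplications can be in progress at once; a third has to wait until one of the units is free, while an addition can still be issued to its own unit in the meantime.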
Branches are a problem for pipelined computers, because the instruction to fetch next is not known until the branch has been resolved. Execution of some instructions may take longer than others. If there are two (or more) units capable of performing a given function (e.g., multiplication), then two operations of that type may be performed at once.
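A rough feel for the cost of branches can be had from the following sketch. It rests on a deliberately crude assumption that is not stated in the text: a taken branch is resolved only in the last stage, so the instructions fetched behind it (one per earlier stage) are discarded. The function name and the `penalty` parameter are hypothetical, and real machines reduce this cost in various ways.

```python
def cycles_with_branches(n_instructions, n_stages, taken_branches, penalty=None):
    """Estimate total cycles for an otherwise ideal pipeline with taken branches.

    Assumption: each taken branch wastes `penalty` cycles. By default this
    is n_stages - 1, i.e. everything fetched behind the branch is discarded.
    """
    if n_instructions == 0:
        return 0
    if penalty is None:
        penalty = n_stages - 1
    ideal = n_stages + (n_instructions - 1)  # fill once, then one per cycle
    return ideal + taken_branches * penalty


print(cycles_with_branches(100, 4, taken_branches=0))   # 103
print(cycles_with_branches(100, 4, taken_branches=20))  # 163
```

Under this model, 20 taken branches in 100 instructions raise the count from 103 to 163 cycles, and the penalty per branch grows with the length of the pipeline.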