Compaq W4000 User Manual

Technical Reference Guide

Figure 3-2 illustrates the internal architecture of the Pentium 4 processor.

[Figure: block diagram of the Pentium 4 processor (CPU). Blocks shown: Out-of-Order Core, Execution
Trace Cache, Branch Prediction, Rapid Exe. Eng. ALUs, 128-bit Integer FPU, L1 Data Cache, 256-KB
8-Way L2 Adv. Transfer Cache, FSB I/F.]

ALU Speed: Core speed x2

Core Speed: 1.4, 1.5, 2.0, 2.2 GHz

FSB Speed: 400 MHz (effective data transfer rate)

Figure 3–2. Pentium 4 Processor Internal Architecture

The Pentium 4 increases processing speed through higher clock speeds, made possible by
hyper-pipelined technology that can handle significantly more instructions at a time. Because
branch mispredictions would cause serious performance hits with such a long pipeline, the
Pentium 4 features a branch prediction mechanism improved by the addition of an execution trace
cache and a refined prediction algorithm. The execution trace cache can store 12K micro-ops
(decoded instructions dealing with branching sequences) that are checked when recurring branches
are processed. Code that is not executed (bypassed) is no longer stored in the L1 cache, as was
the case in the Pentium III.

The out-of-order core features Advanced Dynamic Execution, which provides a large window
(126 instructions) for the execution units to work with. A more accurate branch prediction
algorithm, along with a larger (4-KB) branch target buffer that stores more detail on branch
history, results in a 33% reduction in branch mispredictions compared with the Pentium III.
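To make the 33% figure concrete, the following is a minimal sketch of the arithmetic. Only the 33% reduction comes from the text above; the baseline misprediction count and the per-misprediction penalty are hypothetical values chosen purely for illustration and do not describe either processor.

```python
REDUCTION_PERCENT = 33  # from the text: 33% fewer mispredictions than Pentium III

def mispredictions_after_reduction(baseline_mispredictions: int,
                                   reduction_percent: int = REDUCTION_PERCENT) -> int:
    """Mispredictions remaining after the stated percentage reduction."""
    return baseline_mispredictions * (100 - reduction_percent) // 100

def stall_cycles(mispredictions: int, penalty_cycles: int) -> int:
    """Total cycles lost, assuming a fixed (hypothetical) penalty per misprediction."""
    return mispredictions * penalty_cycles

# Hypothetical workload: 60,000 mispredictions on the baseline, 20 cycles lost each.
baseline = 60_000
penalty = 20
improved = mispredictions_after_reduction(baseline)

print(improved)                          # 40200
print(stall_cycles(baseline, penalty))   # 1200000
print(stall_cycles(improved, penalty))   # 804000
```

Because the penalty per misprediction grows with pipeline length, the same percentage reduction saves more cycles on a longer pipeline.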

The L1 data cache features a low-latency design for minimum response time on cache hits. The
256-KB advanced transfer L2 cache features a 256-bit (32-byte) interface operating at the
processor core speed. The L2 cache of the 1.5-GHz Pentium 4 can therefore provide a transfer rate
of 48 GB/s.
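The 48 GB/s figure follows from the interface width and the core clock. A minimal sketch of that arithmetic, assuming one full-width transfer per core clock and decimal gigabytes (10^9 bytes); the function name is illustrative only:

```python
def l2_bandwidth_gb_per_s(core_clock_hz: int, interface_bits: int = 256) -> float:
    """Peak L2 transfer rate, assuming one full-width transfer per core clock."""
    bytes_per_transfer = interface_bits // 8          # 256 bits -> 32 bytes
    return core_clock_hz * bytes_per_transfer / 1e9   # decimal GB/s

print(l2_bandwidth_gb_per_s(1_500_000_000))  # 48.0
```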

To handle the steady stream of instructions delivered by these combined improvements to the
Pentium 4's CPU core, the rapid execution engine's ALUs operate at twice the processor core
frequency.
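As a quick illustration, the ALU clock for each core speed listed with the figure can be tabulated as follows. This is a sketch of the "core speed x2" relationship only; the names are illustrative.

```python
CORE_SPEEDS_MHZ = [1400, 1500, 2000, 2200]  # core speeds listed with the figure
ALU_MULTIPLIER = 2                          # "ALU Speed: Core speed x2"

def alu_clock_mhz(core_mhz: int) -> int:
    """Rapid execution engine ALU clock for a given core clock."""
    return core_mhz * ALU_MULTIPLIER

for core in CORE_SPEEDS_MHZ:
    print(f"core {core} MHz -> ALU {alu_clock_mhz(core)} MHz")
# core 1400 MHz -> ALU 2800 MHz
# core 1500 MHz -> ALU 3000 MHz
# core 2000 MHz -> ALU 4000 MHz
# core 2200 MHz -> ALU 4400 MHz
```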

The front side bus (FSB) of the Pentium 4 uses a 100-MHz clock but provides double- and
quad-pumped transfers through the use of 200- and 400-MHz strobes. The Pentium 4 can transfer a
complete 64-byte cache line in two 100-MHz bus cycles, for a throughput of 3.2 GB/s.
Address information is transferred at a 200-MHz rate.
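The 3.2 GB/s figure can be checked from the numbers given above. The sketch below assumes decimal gigabytes (10^9 bytes) and that data moves on the 400-MHz (quad-pumped) strobe; the bytes-per-transfer value is derived from those numbers rather than quoted from the text.

```python
BUS_CLOCK_HZ = 100_000_000     # 100-MHz FSB clock
CACHE_LINE_BYTES = 64          # one complete cache line
CYCLES_PER_LINE = 2            # bus clocks needed to move one line
DATA_TRANSFERS_PER_CLOCK = 4   # quad-pumped data (400-MHz strobe), an assumption

def fsb_throughput_gb_per_s() -> float:
    """Throughput when one cache line moves every CYCLES_PER_LINE bus clocks."""
    bytes_per_second = CACHE_LINE_BYTES * BUS_CLOCK_HZ // CYCLES_PER_LINE
    return bytes_per_second / 1e9

def bytes_per_data_transfer() -> int:
    """Bytes moved on each data strobe edge, derived from the figures above."""
    return CACHE_LINE_BYTES // (CYCLES_PER_LINE * DATA_TRANSFERS_PER_CLOCK)

print(fsb_throughput_gb_per_s())   # 3.2
print(bytes_per_data_transfer())   # 8
```

The two views agree: 8 bytes per transfer at an effective 400 million transfers per second is also 3.2 GB/s.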

Compaq Evo and Workstation Personal Computer

Featuring the Intel Pentium 4 Processor

Second Edition - January 2003
