Intel IA-32 User Manual
Vol. 3A 7-9
MULTIPLE-PROCESSOR MANAGEMENT
4. Writes can be buffered.
5. Writes are not performed speculatively; they are only performed for instructions that have
   actually been retired.
6. Data from buffered writes can be forwarded to waiting reads within the processor.
7. Reads or writes cannot pass (be carried out ahead of) I/O instructions, locked instructions,
   or serializing instructions.
8. Reads cannot pass LFENCE and MFENCE instructions.
9. Writes cannot pass SFENCE and MFENCE instructions.
The second rule allows a read to pass a write. However, if the write is to the same memory
location as the read, the processor’s internal “snooping” mechanism will detect the conflict and
update the cached read before the processor executes the instruction that uses the value.
The sixth rule constitutes an exception to an otherwise write-ordered model. Note that the term
“write ordered with store-buffer forwarding” (introduced at the beginning of this section) refers
to the combined effects of rules 2 and 6.
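
The combined effect of rules 2 and 6, together with the fence rules 8 and 9, can be sketched with a small software model. This is only an illustration, not a description of the hardware: the Processor class, its method names, and the fixed schedule in store_buffering_test are hypothetical, and MFENCE is modeled simply as draining the whole store buffer. Real processors drain the buffer at times the program cannot predict.

```python
from collections import deque


class Processor:
    """Toy model of one processor with a FIFO store buffer (illustrative only)."""

    def __init__(self, memory):
        self.memory = memory          # shared dict: location -> value
        self.store_buffer = deque()   # pending (location, value) writes, oldest first

    def write(self, loc, value):
        # Rule 4: the write is buffered instead of going to memory at once.
        self.store_buffer.append((loc, value))

    def read(self, loc):
        # Rule 6: forward the newest buffered write to the same location.
        for buffered_loc, value in reversed(self.store_buffer):
            if buffered_loc == loc:
                return value
        # Rule 2: otherwise the read passes the older buffered writes.
        return self.memory.get(loc, 0)

    def drain_one(self):
        # The oldest buffered write becomes visible to the other processors.
        if self.store_buffer:
            loc, value = self.store_buffer.popleft()
            self.memory[loc] = value

    def mfence(self):
        # Rules 8 and 9: later reads and writes may not pass the fence,
        # modeled here (an assumption) by emptying the buffer first.
        while self.store_buffer:
            self.drain_one()


def store_buffering_test(use_fence):
    """P0: x = 1; r0 = y.   P1: y = 1; r1 = x.   Returns (r0, r1)."""
    memory = {"x": 0, "y": 0}
    p0, p1 = Processor(memory), Processor(memory)
    p0.write("x", 1)
    p1.write("y", 1)
    if use_fence:
        p0.mfence()
        p1.mfence()
    r0 = p0.read("y")
    r1 = p1.read("x")
    return r0, r1
```

Under this one schedule, store_buffering_test(False) returns (0, 0) because each read passes the other processor's still-buffered write, while store_buffering_test(True) returns (1, 1). A read of a location that the same processor has just written returns the buffered value even though memory still holds the old one, which is the forwarding behavior of rule 6.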
In a multiple-processor system, the following ordering rules apply:
• Individual processors use the same ordering rules as in a single-processor system.
• Writes by a single processor are observed in the same order by all processors.
• Writes from the individual processors on the system bus are NOT ordered with respect to
  each other.
The last rule can be clarified by the example in Figure 7-1. Consider a system with three
processors, each of which performs three writes, one to each of three defined locations (A, B,
and C). Individually, the processors perform the writes in the same program order, but because
of bus arbitration and other memory access mechanisms, the order in which the three processors
write the individual memory locations can differ each time the respective code sequences are
executed on the processors. The final values in locations A, B, and C may therefore vary on each
execution of the write sequence.
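
The three-processor example can be explored with a short enumeration. The sketch below is a simplified model under stated assumptions, not a reproduction of Figure 7-1: it assumes a single global order of writes at memory, numbers the processors 1 to 3, and has each processor write its own number to A, then B, then C. The function names are hypothetical.

```python
def interleavings(sequences):
    """Yield every merge of the sequences that keeps each one's internal order."""
    if all(len(seq) == 0 for seq in sequences):
        yield []
        return
    for i, seq in enumerate(sequences):
        if seq:
            # Take the next write of processor i, then merge what remains.
            rest = sequences[:i] + [seq[1:]] + sequences[i + 1:]
            for tail in interleavings(rest):
                yield [seq[0]] + tail


def final_values(order):
    """Apply the writes in one global order; the last write to a location wins."""
    memory = {}
    for processor, location in order:
        memory[location] = processor
    return (memory["A"], memory["B"], memory["C"])


# Each (hypothetical) processor writes its own number to A, then B, then C.
programs = [[(cpu, loc) for loc in "ABC"] for cpu in (1, 2, 3)]

# Set of every final (A, B, C) triple reachable under some interleaving.
outcomes = {final_values(order) for order in interleavings(programs)}
```

Every interleaving preserves each processor's program order, yet the set outcomes contains many different final triples; in this model all 27 combinations of last writer per location are reachable. This matches the point of the example: per-processor write order is preserved, but writes from different processors are not ordered with respect to each other.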
The processor-ordering model described in this section is virtually identical to that used by the
Pentium and Intel486 processors. The only enhancements in the Pentium 4, Intel Xeon, and P6
family processors are:
• Added support for speculative reads.
• Store-buffer forwarding, when a read passes a write to the same memory location.
• Out-of-order stores from long string store and string move operations (see Section 7.2.3,
  “Out-of-Order Stores For String Operations in Pentium 4, Intel Xeon, and P6 Family
  Processors,” below).