
Locked Atomic Operations – Intel IA-32 User Manual


7-2 Vol. 3A

MULTIPLE-PROCESSOR MANAGEMENT

•  To distribute interrupt handling among a group of processors — When several processors are operating in a system in parallel, it is useful to have a centralized mechanism for receiving interrupts and distributing them to available processors for servicing.

•  To increase system performance by exploiting the multi-threaded and multi-process nature of contemporary operating systems and applications.

The IA-32 architecture’s caching mechanism and cache consistency are discussed in Chapter 10, “Memory Cache Control.” The APIC architecture is described in Chapter 8, “Advanced Programmable Interrupt Controller (APIC).” Bus and memory locking, serializing instructions, memory ordering, and Hyper-Threading Technology are discussed in the following sections.

7.1    LOCKED ATOMIC OPERATIONS

The 32-bit IA-32 processors support locked atomic operations on locations in system memory. These operations are typically used to manage shared data structures (such as semaphores, segment descriptors, system segments, or page tables) in which two or more processors may try simultaneously to modify the same field or flag. The processor uses three interdependent mechanisms for carrying out locked atomic operations:

•  Guaranteed atomic operations

•  Bus locking, using the LOCK# signal and the LOCK instruction prefix

•  Cache coherency protocols that ensure that atomic operations can be carried out on cached data structures (cache lock); this mechanism is present in the Pentium 4, Intel Xeon, and P6 family processors

These mechanisms are interdependent in the following ways. Certain basic memory transactions (such as reading or writing a byte in system memory) are always guaranteed to be handled atomically. That is, once started, the processor guarantees that the operation will be completed before another processor or bus agent is allowed access to the memory location. The processor also supports bus locking for performing selected memory operations (such as a read-modify-write operation in a shared area of memory) that typically need to be handled atomically, but are not automatically handled this way. Because frequently used memory locations are often cached in a processor’s L1 or L2 caches, atomic operations can often be carried out inside a processor’s caches without asserting the bus lock. Here the processor’s cache coherency protocols ensure that other processors that are caching the same memory locations are managed properly while atomic operations are performed on cached memory locations.

NOTE

Where there are contested lock accesses, software may need to implement
algorithms that ensure fair access to resources in order to prevent lock
starvation. The hardware provides no resource that guarantees fairness to
participating agents. It is the responsibility of software to manage the fairness
of semaphores and exclusive locking functions.