IA-32 Intel® Architecture Optimization
Memory Optimization Using Prefetch
The Pentium 4 processor has two mechanisms for data prefetch:
software-controlled prefetch and an automatic hardware prefetch.
Software-controlled Prefetch
Software-controlled prefetch is enabled using the four prefetch
instructions introduced with the Streaming SIMD Extensions (SSE):
prefetchnta, prefetcht0, prefetcht1 and prefetcht2. These instructions
are hints to bring a cache line of data into various levels and modes
of the cache hierarchy. Software-controlled prefetch is not intended
for prefetching code. Using it for code can incur significant penalties
on a multiprocessor system when the code is shared.
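As a rough illustration of how the four hints are typically reached from C, the sketch below wraps the `_mm_prefetch` intrinsic from `<xmmintrin.h>`. The names `prefetch_target` and `prefetch_hint` are hypothetical and exist only for this example. On targets without SSE the function is assumed to compile to a no-op.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__SSE__) || defined(_M_X64) || defined(_M_IX86)
#include <xmmintrin.h>  /* _mm_prefetch and the _MM_HINT_* constants */
#define HAVE_SSE_PREFETCH 1
#else
#define HAVE_SSE_PREFETCH 0
#endif

/* Hypothetical names for the four hints; not part of any Intel header. */
enum prefetch_target { TARGET_T0, TARGET_T1, TARGET_T2, TARGET_NTA };

/* Issue one prefetch hint for the cache line containing addr.
 * A prefetch is only a hint: it never changes program-visible data. */
void prefetch_hint(const void *addr, enum prefetch_target target)
{
    assert(addr != NULL);
#if HAVE_SSE_PREFETCH
    const char *p = (const char *)addr;
    switch (target) {
    case TARGET_T0:  _mm_prefetch(p, _MM_HINT_T0);  break;  /* prefetcht0  */
    case TARGET_T1:  _mm_prefetch(p, _MM_HINT_T1);  break;  /* prefetcht1  */
    case TARGET_T2:  _mm_prefetch(p, _MM_HINT_T2);  break;  /* prefetcht2  */
    case TARGET_NTA: _mm_prefetch(p, _MM_HINT_NTA); break;  /* prefetchnta */
    }
#else
    (void)addr;    /* no SSE available: the hint is simply dropped */
    (void)target;
#endif
}
```

The hint argument to `_mm_prefetch` has to be a compile-time constant, which is why each case spells it out rather than passing a variable through.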
Software prefetching has the following characteristics:
• Can handle irregular access patterns, which do not trigger the
  hardware prefetcher.
• Can use less bus bandwidth than hardware prefetching; see below.
• Software prefetches must be added to new code, and do not benefit
  existing applications.
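The first characteristic can be sketched with an indexed gather, where the addresses visited depend on the contents of an index array and so follow no fixed stride. This is a minimal sketch, not a tuned implementation: the function name `gather_sum` and the prefetch distance `GATHER_AHEAD` are assumptions made for the example, and a suitable distance would have to be measured on the target system.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__SSE__) || defined(_M_X64) || defined(_M_IX86)
#include <xmmintrin.h>  /* _mm_prefetch and the _MM_HINT_* constants */
#define HAVE_SSE_PREFETCH 1
#else
#define HAVE_SSE_PREFETCH 0
#endif

/* Hypothetical tuning value: how many iterations ahead to prefetch. */
#define GATHER_AHEAD 8

/* Sum table[index[i]] for i in [0, count). The table accesses are
 * irregular, but index[] itself is read sequentially, so the address
 * needed a few iterations from now is already known. */
long gather_sum(const int *table, const size_t *index, size_t count)
{
    assert(count == 0 || (table != NULL && index != NULL));
    long total = 0;
    for (size_t i = 0; i < count; i++) {
#if HAVE_SSE_PREFETCH
        if (i + GATHER_AHEAD < count) {
            /* Hint the line that iteration i + GATHER_AHEAD will read. */
            _mm_prefetch((const char *)&table[index[i + GATHER_AHEAD]],
                         _MM_HINT_T0);
        }
#endif
        total += table[index[i]];
    }
    return total;
}
```

The prefetch changes only timing, never the result, so the function returns the same sum with or without SSE support.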
Example 6-1
Pseudo-code for Using clflush
while (!buffer_ready) {}
mfence
for (i = 0; i < ...; ...) {
    clflush (char *)((unsigned int)buffer + i)
}
mfence
prefetchnta buffer[0];
VAR = buffer[0];
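The same sequence might be written in C with the SSE2 intrinsics `_mm_mfence` and `_mm_clflush` from `<emmintrin.h>`, plus `_mm_prefetch`. This is a sketch under stated assumptions: the function name `flush_then_read`, the `len` parameter and the 64-byte `ASSUMED_LINE_SIZE` step are not from the example above, and the wait on `buffer_ready` is assumed to have completed before the call.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>  /* _mm_clflush, _mm_mfence; also brings in _mm_prefetch */
#define HAVE_CLFLUSH 1
#else
#define HAVE_CLFLUSH 0
#endif

/* Assumption for this sketch: flush in 64-byte steps. The real line
 * size should be queried from the processor rather than hard-coded. */
#define ASSUMED_LINE_SIZE 64

/* Flush every cache line of buffer, then prefetch and read its first byte. */
char flush_then_read(const char *buffer, size_t len)
{
    assert(buffer != NULL && len > 0);
#if HAVE_CLFLUSH
    _mm_mfence();                         /* order earlier loads and stores */
    for (size_t i = 0; i < len; i += ASSUMED_LINE_SIZE) {
        _mm_clflush(buffer + i);          /* evict the line holding buffer[i] */
    }
    _mm_mfence();                         /* wait for the flushes to complete */
    _mm_prefetch(buffer, _MM_HINT_NTA);   /* non-temporal hint for buffer[0] */
#endif
    return buffer[0];                     /* corresponds to VAR = buffer[0] */
}
```

Flushing writes modified lines back to memory before invalidating them, so the value read afterwards is the one stored in the buffer.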