Itanium instruction metrics, Hypertransport transmit and receive events – HP XC System 3.x Software User Manual
Itanium Instruction Metrics
On Itanium processors, the event counter IA64_INST_RETIRED includes retired instructions
and retired no operation instructions (NOP_RETIRED) but not retired predicate squashed
instructions (PREDICATE_SQUASHED_RETIRED).
• To calculate the total number of retired instructions, add IA64_INST_RETIRED and
PREDICATE_SQUASHED_RETIRED. (See footnote 1.)
• To determine the number of effective retired instructions, subtract NOP_RETIRED from
IA64_INST_RETIRED.
Use the event set IPCEvents (hpcpid -events IPCEvents) to monitor all the events needed
to calculate instructions per cycle (CPU_CYCLES, IA64_INST_RETIRED,
PREDICATE_SQUASHED_RETIRED, and NOP_RETIRED).
To calculate the number of total retired instructions per cycle, use the following formula:
(IA64_INST_RETIRED + PREDICATE_SQUASHED_RETIRED)/CPU_CYCLES
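The arithmetic above can be sketched in a few lines of Python. This is only an illustration, not part of HPCPI: the function name and the sample counts are hypothetical, and the "effective" per-cycle figure is an assumed extension of the total formula to the effective retired count defined earlier.

```python
def retired_instruction_metrics(cpu_cycles, ia64_inst_retired,
                                predicate_squashed_retired, nop_retired):
    """Derive retired-instruction metrics from raw event counts.

    The argument names mirror the Itanium event names in the text;
    the function itself is a hypothetical helper.
    """
    if cpu_cycles <= 0:
        raise ValueError("CPU_CYCLES must be positive")

    # Total retired = IA64_INST_RETIRED + PREDICATE_SQUASHED_RETIRED
    total = ia64_inst_retired + predicate_squashed_retired
    # Effective retired = IA64_INST_RETIRED - NOP_RETIRED
    effective = ia64_inst_retired - nop_retired

    return {
        "total_retired": total,
        "effective_retired": effective,
        # Formula given in the text.
        "total_ipc": total / cpu_cycles,
        # Assumption: the same per-cycle ratio applied to the effective count.
        "effective_ipc": effective / cpu_cycles,
    }


# Made-up counts, for illustration only.
sample = retired_instruction_metrics(
    cpu_cycles=1000,
    ia64_inst_retired=1500,
    predicate_squashed_retired=100,
    nop_retired=300,
)
```

With these invented counts, the total retired count is 1600 and the effective count is 1200, giving 1.6 total retired instructions per cycle.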
Measuring Memory Controller and HyperTransport Events
Memory controller events, such as DRAM access, and HyperTransport events are system events.
On multicore processors, these events can be monitored only from core 0. These events are not
attributed to the process or thread that caused them if the process or thread does not execute on
core 0; instead, they are attributed to the process running on core 0 when the event is recorded.
To correctly measure memory controller and HyperTransport events on multicore processors,
restrict execution of the process or threads to a CPU that is core 0. You can use the contents of
the /proc/cpuinfo file to determine which CPUs are core 0 and the taskset utility to launch
a process with a specified CPU affinity.
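As a rough sketch of that procedure, the following Python parses text in the /proc/cpuinfo format and builds a taskset command line. It assumes the x86 Linux layout, in which each CPU stanza has a "processor" line followed by a "core id" line; other architectures may use different field names. The helper names, the sample text, and ./my_app are hypothetical.

```python
def core0_cpus(cpuinfo_text):
    """Return the logical CPU numbers whose 'core id' field is 0.

    Assumes each stanza lists 'processor' before 'core id', as on
    x86 Linux systems.
    """
    cpus = []
    current = None
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator line between stanzas
        key = key.strip()
        value = value.strip()
        if key == "processor":
            current = int(value)
        elif key == "core id" and current is not None and int(value) == 0:
            cpus.append(current)
    return cpus


def taskset_command(cpu, program_args):
    """Build an argument list that launches a program pinned to one CPU."""
    return ["taskset", "-c", str(cpu)] + list(program_args)


# Hypothetical excerpt in /proc/cpuinfo format (two dual-core sockets,
# second socket truncated to its first core).
SAMPLE_CPUINFO = """\
processor : 0
physical id : 0
core id : 0

processor : 1
physical id : 0
core id : 1

processor : 2
physical id : 1
core id : 0
"""

core0 = core0_cpus(SAMPLE_CPUINFO)
command = taskset_command(core0[0], ["./my_app"])
```

On a real system, the text would come from reading /proc/cpuinfo, and the resulting argument list could be passed to subprocess.run.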
HyperTransport Transmit and Receive Events
Only HyperTransport transmit events can be monitored (data, command, and transmit event types). There
is no direct way to monitor HyperTransport receive events. However, you can infer receive
events by observing the transmit events on the sender. For example, memory requests from a
process running on CPU 1 for memory attached to CPU 2 generate HyperTransport transmit
requests on CPU 1.
Accessing memory from a remote processor generates HyperTransport traffic. A process might
access memory that is more than one hop away, through an intermediate CPU. For example,
a memory request from CPU 1 to CPU 3 might be transmitted through CPU 2. In this case, there
will be HyperTransport transmit events on CPU 1 and CPU 2.
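The multi-hop case can be modeled with a small sketch. It assumes the route of a request is already known as an ordered list of CPU numbers, and it covers only the request direction described above. The function name is hypothetical and nothing here queries real hardware.

```python
def transmitting_cpus(route):
    """Return the CPUs expected to show transmit events for a request.

    `route` lists the CPUs a request passes through, from the requester
    to the CPU that owns the memory. Every CPU except the last one
    forwards the request, so each of them records transmit events.
    """
    if len(route) < 2:
        return []  # local access: no HyperTransport hop
    return list(route[:-1])


# The example from the text: CPU 1 requests memory on CPU 3 through CPU 2.
two_hop = transmitting_cpus([1, 2, 3])
```

For the route in the example, the sketch reports CPU 1 and CPU 2, which matches the transmit events described in the text.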
1. “Errata (Processor and PAL)” in Intel® Itanium® 2 Processor Specification Update February 2005 states that the
IA64_RETIRED event count does not include predicated off instructions.
Tips and Best Practices for Using HPCPI