Error reporting and handling, 1 error sources and types, 2 handling and logging system errors – PC Concepts SHG2 DP User Manual

Intel® SHG2 DP Server Board Technical Product Specification

Error Reporting and Handling

Revision 1.0

Intel Order Number C11343-001



6. Error Reporting and Handling

This section defines how errors are handled by the system BIOS on the Intel SHG2 server
board. It also discusses the role of the BIOS in error handling, and the interaction between
the BIOS, platform hardware, and server management firmware with regard to error handling.
In addition, it describes error-logging techniques and defines the beep codes for errors.

6.1 Error Sources and Types

One of the major requirements of server management is to correctly and consistently handle
system errors. System errors, which can be disabled and enabled individually or as a group,
can be categorized as follows:

PCI bus

Memory correctable and uncorrectable errors

Sensors

Processor internal error, bus/address error, thermal trip error, temperatures and
voltages, and Assisted Gunning Transceiver Logic (AGTL+) voltage levels

Sensors are managed by the BMC. The BMC is capable of receiving event messages from
individual sensors and logging system events.

6.2 Handling and Logging System Errors

This section describes actions taken by the SMI handler with respect to the various categories
of system errors. It covers the events logged by the BIOS, and the format of data bytes
associated with those events. The BIOS is responsible for monitoring and logging certain
system events. To log an event, the BIOS sends a platform event message to the BMC. Some
errors, such as processor failure, are logged during early POST rather than through the SMI
handler.
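
As an illustration of what a platform event message carries, the following Python sketch packs the request data bytes of the IPMI v1.5 Platform Event Message command as sent over the system interface. It is not the BIOS implementation. The class and method names are hypothetical, and the sensor type, sensor number, and event data values in the example are placeholders, not values documented for the SHG2 board.

```python
from dataclasses import dataclass

EVM_REV_IPMI_1_5 = 0x04  # event message format revision used by IPMI v1.5


@dataclass
class PlatformEvent:
    """Hypothetical container for one platform event message."""
    generator_id: int      # odd value identifying system software as the source
    sensor_type: int
    sensor_number: int
    event_type: int        # 7-bit event/reading type code
    event_data: tuple = (0xFF, 0xFF, 0xFF)
    assertion: bool = True

    def to_request(self) -> bytes:
        """Return the eight request data bytes in IPMI v1.5 order."""
        if len(self.event_data) != 3:
            raise ValueError("exactly three event data bytes are required")
        # Bit 7 is the event direction: 0 = assertion, 1 = deassertion.
        event_dir = 0x00 if self.assertion else 0x80
        return bytes([
            self.generator_id,
            EVM_REV_IPMI_1_5,
            self.sensor_type,
            self.sensor_number,
            event_dir | (self.event_type & 0x7F),
            *self.event_data,
        ])


# Placeholder values chosen only to exercise the packing logic.
example = PlatformEvent(
    generator_id=0x01,
    sensor_type=0x0C,
    sensor_number=0x08,
    event_type=0x6F,
    event_data=(0x00, 0xFF, 0xFF),
)
print(example.to_request().hex(" "))
```

Keeping the packing in one method makes the byte order explicit and easy to compare against the data byte formats described for each event.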

6.2.1 Logging Format Conventions

The BIOS complies with the Intelligent Platform Management Interface Specification, Revision
1.5. The BIOS always uses a system software ID within the range 00h-1Fh to log errors. As a
result, the generator ID byte is an odd number in the range 01h-3Fh. The OEM user binary
should use a software ID of 1. The software ID allows external software to find the origin of
the event message.
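
The relationship between the software ID and the generator ID byte can be sketched as follows. In the IPMI generator ID byte, bit 0 is set to 1 when the source is system software and bits 7:1 hold the software ID, which is why the software ID range 00h-1Fh maps to odd generator IDs from 01h to 3Fh. The function names are hypothetical and are used for illustration only.

```python
def generator_id(software_id: int) -> int:
    """Build the generator ID byte for a BIOS system software ID."""
    if not 0x00 <= software_id <= 0x1F:
        raise ValueError("software ID must be in the range 00h-1Fh")
    # Bits 7:1 carry the software ID; bit 0 = 1 marks system software.
    return (software_id << 1) | 0x01


def software_id_of(gen_id: int) -> int:
    """Recover the software ID from a generator ID byte."""
    if not 0x00 <= gen_id <= 0xFF:
        raise ValueError("generator ID must fit in one byte")
    if gen_id & 0x01 == 0:
        raise ValueError("bit 0 is clear: source is not system software")
    return gen_id >> 1


for sw_id in (0x00, 0x01, 0x1F):
    print(f"software ID {sw_id:02X}h -> generator ID {generator_id(sw_id):02X}h")
```

External software that reads the event log can apply the reverse mapping to find which software component originated an event message.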