Memory address parity error – Kontron S5500 SEL Troubleshooting User Manual

Memory subsystem

System Event Log Troubleshooting Guide for Intel® S5500/S3420 series Server Boards

Intel order number G74211-001

Revision 1.0

Event Trigger Offset: 00h (Correctable ECC Error threshold reached)

Description: There have been too many (10 or more) correctable ECC errors for this particular DIMM since last boot. This event in itself does not pose any direct problems as the ECC errors are still being corrected. Depending on the RAS configuration of the memory, the IMC may take the affected DIMM offline.

Next Steps: Even though this event doesn't immediately lead to problems, it can indicate one of the DIMM modules is slowly failing. If this error occurs more than once:

1. If needed, decode DIMM location from hex version of SEL.
2. Verify DIMM is seated properly.
3. Examine gold fingers on edge of DIMM to verify contacts are clean.
4. Inspect processor socket this DIMM is connected to for bent pins, and if found, replace the board.
5. Consider replacing the DIMM as a preventative measure. For multiple occurrences, replace the DIMM.
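The "more than once" rule in the steps above can be sketched as a small tally over already-decoded events. This is only an illustration: the function name `dimms_to_replace` and the DIMM labels are hypothetical, and it assumes each threshold event has already been mapped to a DIMM location by some other means.

```python
from collections import Counter


def dimms_to_replace(threshold_events, min_occurrences=2):
    """Return DIMM labels whose 00h threshold event was logged repeatedly.

    threshold_events: iterable of DIMM labels, one entry per logged
    "Correctable ECC Error threshold reached" event (labels are hypothetical).
    """
    counts = Counter(threshold_events)
    # "More than once" in the guide is treated here as at least two events.
    return sorted(dimm for dimm, n in counts.items() if n >= min_occurrences)


# Hypothetical decoded events: DIMM A1 crossed the threshold twice.
events = ["CPU1_DIMM_A1", "CPU1_DIMM_B1", "CPU1_DIMM_A1"]
print(dimms_to_replace(events))
```

A DIMM that appears only once would still be a candidate for the seating and contact checks, but would not be flagged for replacement by this sketch.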

9.2.2 Memory Address Parity Error

Address Parity errors are detected in the memory addressing hardware. Because they affect the addressing of memory contents, they can
potentially lead to the same sort of failures as ECC errors. They are logged as a distinct error type because they affect memory addressing
rather than memory contents, but are otherwise treated exactly the same as Uncorrectable ECC Errors. Address Parity errors are logged
to the BMC SEL, with Event Data that identifies the failing address by channel and DIMM to the extent possible.

Table 62: Address Parity Error Sensor Typical Characteristics

Byte    Field            Description
8, 9    Generator ID     0033h = BIOS SMI Handler
11      Sensor Type      0Ch = Memory
12      Sensor Number    14h
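As an illustration of how the fields in Table 62 might be checked against a raw SEL entry, the following is a minimal sketch rather than a definitive decoder. It assumes the standard 16-byte IPMI SEL record, 1-based byte positions as in the table, and a little-endian Generator ID (byte 8 = 33h, byte 9 = 00h). The function name and the sample record are hypothetical.

```python
import struct

# Values taken from Table 62.
GENERATOR_ID_BIOS_SMI = 0x0033
SENSOR_TYPE_MEMORY = 0x0C
SENSOR_NUMBER_ADDRESS_PARITY = 0x14


def is_address_parity_record(record: bytes) -> bool:
    """Return True if a raw 16-byte SEL record matches the Table 62 fields."""
    if len(record) != 16:
        raise ValueError("expected a 16-byte SEL record")
    # Bytes 8-9 (1-based) hold the Generator ID; assumed little-endian.
    (generator_id,) = struct.unpack_from("<H", record, 7)
    sensor_type = record[10]    # byte 11
    sensor_number = record[11]  # byte 12
    return (
        generator_id == GENERATOR_ID_BIOS_SMI
        and sensor_type == SENSOR_TYPE_MEMORY
        and sensor_number == SENSOR_NUMBER_ADDRESS_PARITY
    )


# Hypothetical record: only bytes 8, 9, 11 and 12 carry meaningful values here.
sample = bytes([0] * 7 + [0x33, 0x00, 0x04, 0x0C, 0x14, 0x00, 0x00, 0x00, 0x00])
print(is_address_parity_record(sample))
```

The sketch deliberately ignores the Event Data bytes, since their channel and DIMM encoding is not described in this section.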