High availability overview, Availability requirements, Availability evaluation – H3C Technologies H3C SR8800 User Manual

High availability overview
Communication interruptions can seriously affect widely deployed value-added services such as IPTV
and video conferencing. The underlying network infrastructure must therefore provide high
availability.
There are three effective ways to improve availability:
• Increasing fault tolerance
• Speeding up fault recovery
• Reducing the impact of faults on services
Availability requirements
Availability requirements fall into three levels based on purpose and implementation.
Table 1 Availability requirements
Level | Purpose | Implementation
1 | Decrease system software and hardware faults | Hardware: simplifying circuit design, enhancing production techniques, and performing reliability tests. Software: reliability design and test.
2 | Protect system functions from being affected if faults occur | Device and link redundancy and deployment of switchover strategies.
3 | Enable the system to recover as fast as possible | Providing fault detection, diagnosis, isolation, and recovery technologies.
The level 1 availability requirement should be considered during the design and production process of
network devices. The level 2 availability requirement should be considered during network design. The
level 3 availability requirement should be considered during network deployment according to the
network infrastructure and service characteristics.
Availability evaluation
Typically, Mean Time Between Failures (MTBF) and Mean Time to Repair (MTTR) are used to evaluate the
availability of a network.
MTBF
MTBF is the predicted elapsed time between inherent failures of a system during operation. It is typically
expressed in hours. A higher MTBF means higher availability.
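As a rough illustration, MTBF is commonly estimated from field data as the total operating time divided by the number of failures observed in that time. The sketch below assumes that estimator. The function name and the sample figures are hypothetical and do not come from this manual.

```python
def estimate_mtbf_hours(total_operating_hours: float, failure_count: int) -> float:
    """Estimate MTBF (in hours) as operating time divided by observed failures.

    This is a common field estimator, used here as an assumption; the manual
    itself only defines MTBF as a predicted time between failures.
    """
    if failure_count <= 0:
        raise ValueError("at least one observed failure is needed to estimate MTBF")
    return total_operating_hours / failure_count


# Hypothetical sample: 10 devices, each running 8760 hours (one year),
# with 4 failures observed across the whole group.
mtbf_hours = estimate_mtbf_hours(10 * 8760, 4)
print(mtbf_hours)  # 21900.0
```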
MTTR
MTTR is the average time required to repair a failed system. MTTR in a broad sense also involves spare
parts management and customer services.
MTTR = fault detection time + hardware replacement time + system initialization time + link recovery time
+ routing time + forwarding recovery time
Reducing any of these terms reduces the MTTR and therefore increases availability.
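The MTTR sum above can be sketched in a few lines of code. The example also combines MTTR with MTBF using the widely used steady-state formula availability = MTBF / (MTBF + MTTR). That formula is an assumption added for illustration and is not stated in this manual. The class name, field names, and sample values are likewise hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass
class RepairTimes:
    """The six MTTR terms, all in hours (field names are illustrative)."""
    fault_detection: float
    hardware_replacement: float
    system_initialization: float
    link_recovery: float
    routing: float
    forwarding_recovery: float

    def mttr(self) -> float:
        # MTTR is the plain sum of all six terms.
        return sum(getattr(self, f.name) for f in fields(self))


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability, assuming the MTBF / (MTBF + MTTR) model."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical sample values, in hours.
times = RepairTimes(
    fault_detection=0.25,
    hardware_replacement=1.0,
    system_initialization=0.25,
    link_recovery=0.125,
    routing=0.125,
    forwarding_recovery=0.25,
)
print(times.mttr())                        # 2.0
print(availability(1998.0, times.mttr()))  # 0.999
```

With these sample values, halving the hardware replacement time to 0.5 hours lowers the MTTR to 1.5 hours and raises the computed availability slightly, which matches the statement that a smaller term gives a smaller MTTR and higher availability.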