Vmware fault tolerance architecture, Deterministic record/replay, Fault tolerance – VMware vSphere Fault Tolerance 4 User Manual

VMware white paper

VMware® Fault Tolerance (FT) provides continuous availability to virtual machines, eliminating downtime and disruption — even in
the event of a complete host failure. This whitepaper gives a brief description of the VMware FT architecture and discusses the
performance implication of this feature with data from a wide variety of workloads.

1. VMware Fault Tolerance architecture

The technology behind VMware Fault Tolerance is called VMware® vLockstep. The following sections describe some of the key aspects
of VMware vLockstep technology.

1.1. Deterministic record/replay

Deterministic Record/Replay is a technology introduced with VMware Workstation 6.0 that allows for capturing the execution of a
running virtual machine for later replay. Deterministic replay of computer execution is challenging since external inputs like incoming
network packets, mouse, keyboard, and disk I/O completion events operate asynchronously and trigger interrupts that alter the code
execution path. Deterministic replay can be achieved by recording the non-deterministic inputs and then injecting those inputs at
the same execution point during replay (see Figure 1). This method requires far fewer processing resources and far less space than
exhaustively recording and replaying individual instructions.

Figure 1. Event Injection during Replay (events shown: Disk I/O, Timer Event)
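
The record-then-inject idea can be illustrated with a small Python sketch. It is only a toy model under stated assumptions, not VMware's implementation: `ToyVM`, `LoggedEvent`, `record`, and `replay` are hypothetical names, the "execution point" is modeled as a simple count of executed instructions, and each external input is reduced to a single integer payload.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoggedEvent:
    at_instruction: int  # execution point at which the input was delivered
    payload: int         # value carried by the input (hypothetical encoding)


class ToyVM:
    """Deterministic toy machine; external inputs are its only non-determinism."""

    def __init__(self):
        self.instructions_retired = 0
        self.state = 0

    def step(self):
        # One deterministic "instruction".
        self.state = (self.state * 31 + 7) % 1_000_003
        self.instructions_retired += 1

    def deliver(self, payload):
        # An external input alters the state, and so the later execution path.
        self.state ^= payload


def record(total_steps, arrivals):
    """Run the VM and log every input together with its execution point.

    `arrivals` maps an instruction count to the payload that happens to
    arrive at that moment (the asynchronous part, outside the VM's control).
    """
    vm, log = ToyVM(), []
    for _ in range(total_steps):
        payload = arrivals.get(vm.instructions_retired)
        if payload is not None:
            vm.deliver(payload)
            log.append(LoggedEvent(vm.instructions_retired, payload))
        vm.step()
    return vm.state, log


def replay(total_steps, log):
    """Re-run the VM, injecting each logged input at its recorded point."""
    vm = ToyVM()
    pending = iter(log)
    upcoming = next(pending, None)
    for _ in range(total_steps):
        while upcoming is not None and upcoming.at_instruction == vm.instructions_retired:
            vm.deliver(upcoming.payload)
            upcoming = next(pending, None)
        vm.step()
    return vm.state


final_state, event_log = record(20, {3: 42, 10: 7})
replayed_state = replay(20, event_log)
```

Only the two inputs are logged, not the twenty instructions, and the replayed run still ends in the same state as the recorded one.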

To inject the inputs efficiently at the correct execution point, some processor changes were required. VMware collaborated
with AMD and Intel to make sure all currently shipping Intel and AMD server processors support these changes. See KB article
1008027 for a list of supported processors.

VMware currently supports record/replay only for uniprocessor virtual machines. Record/replay of symmetric multiprocessing (SMP)
virtual machines is more challenging because, in addition to recording all external inputs, the order of shared-memory accesses
must also be captured for deterministic replay.
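
A short Python sketch shows why the order of shared-memory accesses matters. It is a toy model, not how a hypervisor tracks memory: `run_smp`, the two per-vCPU programs, and the single shared cell are assumptions made for illustration. The same programs produce different results under different interleavings, so a replay is reproducible only if the interleaving itself was recorded.

```python
def run_smp(schedule, programs):
    """Apply per-vCPU operations to one shared cell in the given order.

    `schedule` lists which vCPU performs its next operation at each step.
    """
    shared = 0
    next_op = [0] * len(programs)  # per-vCPU program counter
    for cpu in schedule:
        op, arg = programs[cpu][next_op[cpu]]
        shared = shared + arg if op == "add" else shared * arg
        next_op[cpu] += 1
    return shared


programs = [
    [("add", 1), ("add", 2)],  # vCPU 0
    [("mul", 3), ("mul", 2)],  # vCPU 1
]

recorded_order = [0, 0, 1, 1]  # vCPU 0 runs first: ((0 + 1) + 2) * 3 * 2
other_order = [1, 1, 0, 0]     # vCPU 1 runs first: (0 * 3 * 2) + 1 + 2

result_recorded = run_smp(recorded_order, programs)
result_other = run_smp(other_order, programs)
result_replayed = run_smp(recorded_order, programs)
```

Here `result_recorded` is 18 and `result_other` is 3, while replaying with the recorded order reproduces 18.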

1.2. Fault Tolerance Logging Traffic

Figure 2 shows the high-level architecture of VMware Fault Tolerance.

VMware FT relies on the deterministic record/replay technology described above. When VMware FT is enabled for a virtual machine (the
“primary”), a second instance of the virtual machine (the “secondary”) is created by live-migrating the memory contents of the primary
using VMware® VMotion™. Once live, the secondary virtual machine runs in lockstep and effectively mirrors the guest instruction
execution of the primary.

The hypervisor running on the primary host captures external inputs to the virtual machine and transfers them asynchronously to the
secondary host. The hypervisor running on the secondary host receives these inputs and injects them into the replaying virtual machine
at the appropriate execution point. The primary and the secondary virtual machines share the same virtual disk on shared storage, but
all I/O operations are performed only on the primary host. While the hypervisor does not issue I/O produced by the secondary, it posts
all I/O completion events to the secondary virtual machine at the same execution point as they occurred on the primary.
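
The division of work between the two hosts can be sketched in Python. This is a simplified model under stated assumptions, not the vLockstep protocol: `Guest`, `SharedDisk`, `run_primary`, `run_secondary`, and the `IO_DONE` marker are hypothetical, the logging channel is a plain in-memory queue, and the secondary simply consumes the log after the primary has produced it, which stands in for asynchronous transfer.

```python
from collections import deque

IO_DONE = 1  # hypothetical marker value for an I/O completion event


class SharedDisk:
    """Virtual disk on shared storage, visible to both hosts."""

    def __init__(self):
        self.blocks = {}
        self.writes_issued = 0

    def write(self, block, data):
        self.blocks[block] = data
        self.writes_issued += 1


class Guest:
    """Toy guest: deterministic apart from the events injected into it."""

    def __init__(self):
        self.executed = 0
        self.state = 0

    def step(self):
        self.state = (self.state * 31 + 7) % 1_000_003
        self.executed += 1
        # Every 4th step the guest asks to write its state to block 0.
        return (0, self.state) if self.executed % 4 == 0 else None

    def inject(self, value):
        self.state ^= value


def run_primary(steps, inputs, disk, channel):
    """Primary host: performs real I/O and logs every injected event."""
    guest = Guest()
    for _ in range(steps):
        request = guest.step()
        events = []
        if request is not None:
            disk.write(*request)       # only the primary issues the I/O
            events.append(IO_DONE)     # completion is an input to the guest
        if guest.executed in inputs:
            events.append(inputs[guest.executed])
        for value in events:
            guest.inject(value)
            channel.append((guest.executed, value))  # sent to the secondary
    return guest.state


def run_secondary(steps, channel):
    """Secondary host: replays logged events and never touches the disk."""
    guest = Guest()
    for _ in range(steps):
        guest.step()  # the write request it returns is dropped, not issued
        while channel and channel[0][0] == guest.executed:
            _, value = channel.popleft()
            guest.inject(value)        # same execution point as on the primary
    return guest.state


disk = SharedDisk()
channel = deque()
primary_state = run_primary(12, {2: 99, 8: 5}, disk, channel)
secondary_state = run_secondary(12, channel)
```

Both guests end in the same state, yet the shared disk sees only the three writes issued by the primary; the secondary learns about each completion solely from the log.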