Figure 6, LU performance – Dell PowerEdge R820 User Manual
Performance Analysis of HPC Applications on Several Dell PowerEdge 12th Generation Servers
Figure 6. LU performance
4.4. WRF
The performance of WRF on the three clusters is shown in Figure 7. The Dell PowerEdge M620 outperforms
the PowerEdge R820 by a widening margin as the cluster size increases. A previous study of the impact
of memory bandwidth [8] observed that a 16 percent drop in memory bandwidth translates to a 4 percent
drop in WRF performance. In this case, memory bandwidth per core on the PowerEdge R820 is ~30 percent
lower than on the PowerEdge M620. Thus, a portion of the performance drop on the PowerEdge R820s can
be attributed to the lower memory bandwidth per core. However, for a fixed cluster size (number of
cores), fewer PowerEdge R820 servers are needed to achieve this level of performance, since the
PowerEdge R820 is a four-socket system.
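As a rough illustration of the reasoning above, the sketch below scales the single reference point from [8] (a 16 percent bandwidth drop costing 4 percent of WRF performance) to the ~30 percent per-core bandwidth gap of the PowerEdge R820. The linear scaling and the function name `estimated_perf_drop` are assumptions made for illustration only; the source does not state that the relationship is linear.

```python
# Reference point from the earlier memory bandwidth study [8]:
# a 16% drop in memory bandwidth translated to a 4% drop in WRF performance.
REF_BW_DROP = 0.16
REF_PERF_DROP = 0.04


def estimated_perf_drop(bw_drop: float) -> float:
    """Scale the reference sensitivity linearly to another bandwidth drop.

    Linear scaling is an assumption made here; the source gives only
    one data point for the bandwidth/performance relationship.
    """
    return bw_drop * (REF_PERF_DROP / REF_BW_DROP)


# ~30% lower memory bandwidth per core on the R820 versus the M620.
r820_drop = estimated_perf_drop(0.30)
print(f"Estimated WRF drop from bandwidth alone: {r820_drop:.1%}")
```

Under this assumption, bandwidth alone would account for a drop of roughly 7 to 8 percent, which is consistent with the text attributing only a portion of the R820 gap to memory bandwidth.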
WRF is impacted by the lower processor frequency of the PowerEdge M420 (2.3 GHz versus 2.7 GHz on the
PowerEdge M620). The PowerEdge M420 cluster performs ~15 percent lower than the PowerEdge M620
cluster at every cluster size up to 128 cores. Memory bandwidth per core on the PowerEdge M420 is
also 23 percent lower than on the PowerEdge M620, which contributes to this drop in performance.
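To put the frequency difference in perspective, the sketch below computes the relative clock deficit of the 2.3 GHz PowerEdge M420 against the 2.7 GHz PowerEdge M620 (the frequencies come from the chart legend). Treating performance as proportional to clock frequency is a simplifying assumption, and `relative_clock_deficit` is a hypothetical helper name.

```python
# Processor frequencies taken from the chart legend.
M620_GHZ = 2.7
M420_GHZ = 2.3


def relative_clock_deficit(slow_ghz: float, fast_ghz: float) -> float:
    """Return the fractional clock shortfall of the slower processor.

    Reading this as a performance deficit assumes performance scales
    with clock frequency, which is a simplification.
    """
    return 1.0 - slow_ghz / fast_ghz


deficit = relative_clock_deficit(M420_GHZ, M620_GHZ)
print(f"M420 clock deficit relative to M620: {deficit:.1%}")
```

The result, just under 15 percent, is close to the observed gap. Because the text also names memory bandwidth as a contributor, the two effects should not be read as simply additive.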
At 256 cores, the PowerEdge M420s perform 11 percent better than the PowerEdge M620s. This data
point is repeatable and is not explained by the difference in the InfiniBand network architecture
between these two clusters. At the time of writing, this aspect was still under study.
[Chart: performance relative to PowerEdge M620 (higher is better) versus number of cores (8, 16, 32, 64, 128, 256). Series: M620-2.7GHz, R820-2.7GHz, M420-2.3GHz. Y-axis range: 0.00 to 1.20. Data labels as extracted, with series assignment not recoverable: 0.91, 0.92, 0.92, 1.04, 1.03, 0.94, 0.89, 0.89, 0.89, 0.88, 0.93.]