HP StorageWorks Scalable File Share User Manual
5.5 Testing Your Configuration
The best way to sanity-test your Lustre file system is to perform normal file system operations
using standard Linux shell commands such as df, cd, and ls. To measure the performance of your
installation, you can use your own application or the standard file system performance
benchmarks described in Chapter 17, Benchmarking, of the Lustre 1.6 Operations Manual.
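The sanity test described above can be sketched as a small shell function. This is only an illustration, not an HP-supplied tool: the function name smoke_test and the mount point /mnt/lustre are hypothetical, and the write and read-back step is an addition that goes slightly beyond the df, cd, and ls commands named in the text.

```shell
# Smoke-test a mounted file system with ordinary shell operations.
# The directory to test is passed as the first argument.
# NOTE: smoke_test and /mnt/lustre are illustrative names, not part of SFS.
smoke_test() {
    dir=$1
    testfile="$dir/.smoke_test.$$"

    # df, cd, and ls must all succeed on the target directory.
    df "$dir" > /dev/null || { echo "df failed on $dir"; return 1; }
    ( cd "$dir" && ls > /dev/null ) || { echo "cd/ls failed on $dir"; return 1; }

    # Write a small file, read it back, and compare the contents.
    echo "lustre smoke test" > "$testfile" || { echo "write failed"; return 1; }
    if [ "$(cat "$testfile")" != "lustre smoke test" ]; then
        echo "read-back mismatch"
        rm -f "$testfile"
        return 1
    fi
    rm -f "$testfile"
    echo "ok"
}

# Example: smoke_test /mnt/lustre
```

A check like this only confirms that basic operations work. It says nothing about performance, for which the benchmarks mentioned above are the appropriate tool.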
5.5.1 Examining and Troubleshooting
If your file system is not behaving properly, refer to Part III, Lustre Tuning, Monitoring and
Troubleshooting, of the Lustre 1.6 Operations Manual. Part V, Reference, of the same manual
describes many important commands for file system operation and analysis, including lctl, lfs,
tunefs.lustre, and debugfs. Some of the most useful diagnostic and troubleshooting commands
are briefly described below.
5.5.1.1 On the Server
Use the following command to check the health of the system.
# cat /proc/fs/lustre/health_check
healthy
This returns healthy if there are no catastrophic problems. However, other less severe problems
that prevent proper operation might still exist.
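For routine monitoring, the health check can be wrapped in a function that returns a nonzero exit status when the node does not report healthy. The following is a minimal sketch. The function name check_health is hypothetical, and accepting the file path as an optional argument is an assumption made here so that the function can be tried against a sample file.

```shell
# Report whether a Lustre server considers itself healthy.
# Exit status: 0 = healthy, 1 = not healthy, 2 = status file unreadable.
# NOTE: check_health is an illustrative name, not a Lustre command.
check_health() {
    # Default to the path used in this section; an alternative path
    # may be given as the first argument.
    file=${1:-/proc/fs/lustre/health_check}
    [ -r "$file" ] || { echo "cannot read $file"; return 2; }

    status=$(cat "$file")
    if [ "$status" = "healthy" ]; then
        echo "healthy"
        return 0
    fi
    echo "NOT healthy: $status"
    return 1
}

# Example: check_health || echo "investigate this server"
```

As noted above, a healthy result rules out only catastrophic problems, so a script like this should complement the other checks in this section rather than replace them.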
Use the following command to show the LNET network identifiers (NIDs) of the interfaces active on the node.
# lctl list_nids
172.31.97.1@o2ib
Use the following command to show the Lustre network connections that the node is aware of,
some of which may not be currently active.
# cat /proc/sys/lnet/peers
nid                      refs  state   max   rtr   min    tx   min queue
0@lo                        1   ~rtr     0     0     0     0     0     0
172.31.97.2@o2ib            1   ~rtr     8     8     8     8     7     0
172.31.64.1@o2ib            1   ~rtr     8     8     8     8     6     0
172.31.64.2@o2ib            1   ~rtr     8     8     8     8     5     0
172.31.64.3@o2ib            1   ~rtr     8     8     8     8     5     0
172.31.64.4@o2ib            1   ~rtr     8     8     8     8     6     0
172.31.64.6@o2ib            1   ~rtr     8     8     8     8     6     0
172.31.64.8@o2ib            1   ~rtr     8     8     8     8     6     0
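Because some of the listed peers may not be currently active, it can be useful to extract the NIDs from this table and test each one. The sketch below assumes the table layout shown above (one header line, NID in the first column). The function name list_peer_nids is hypothetical, and the optional file argument exists only so that the function can be run against a saved copy of the table.

```shell
# Print the NID of every peer in an LNET peers table, skipping the
# header line and the loopback entry (0@lo).
# NOTE: list_peer_nids is an illustrative name, not a Lustre command.
list_peer_nids() {
    file=${1:-/proc/sys/lnet/peers}
    awk 'NR > 1 && $1 != "0@lo" { print $1 }' "$file"
}

# Example (as root on a server): ping each known peer with lctl ping.
# for nid in $(list_peer_nids); do
#     lctl ping "$nid" > /dev/null || echo "no reply from $nid"
# done
```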
Use the following command on an MDS server or client to show the status of all file system
components, as shown below. On an MGS or OSS server, it only shows the components running
on that server.
# lctl dl
0 UP mgc MGC172.31.103.1@o2ib 81b13870-f162-80a7-8683-8782d4825066 5
1 UP mdt MDS MDS_uuid 3
2 UP lov hpcsfsc-mdtlov hpcsfsc-mdtlov_UUID 4
3 UP mds hpcsfsc-MDT0000 hpcsfsc-MDT0000_UUID 195
4 UP osc hpcsfsc-OST000f-osc hpcsfsc-mdtlov_UUID 5
5 UP osc hpcsfsc-OST000c-osc hpcsfsc-mdtlov_UUID 5
6 UP osc hpcsfsc-OST000d-osc hpcsfsc-mdtlov_UUID 5
7 UP osc hpcsfsc-OST000e-osc hpcsfsc-mdtlov_UUID 5
8 UP osc hpcsfsc-OST0008-osc hpcsfsc-mdtlov_UUID 5
9 UP osc hpcsfsc-OST0009-osc hpcsfsc-mdtlov_UUID 5
10 UP osc hpcsfsc-OST000b-osc hpcsfsc-mdtlov_UUID 5
11 UP osc hpcsfsc-OST000a-osc hpcsfsc-mdtlov_UUID 5
12 UP osc hpcsfsc-OST0005-osc hpcsfsc-mdtlov_UUID 5
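On a system with many OSTs, the device list is long, and a component that is not up is easy to overlook. The following sketch filters the lctl dl output so that only devices whose state column is not UP are printed. It assumes the column layout shown above (state in the second column), and the function name find_down_devices is hypothetical.

```shell
# Read `lctl dl` output on standard input and print every device whose
# state (second column) is not UP.
# Exit status: 0 if all devices are UP, 1 if any other state was seen.
# NOTE: find_down_devices is an illustrative name, not a Lustre command.
find_down_devices() {
    awk '$2 != "UP" { print; found = 1 } END { exit found + 0 }'
}

# Example: lctl dl | find_down_devices || echo "some devices are not UP"
```

No output and a zero exit status mean that every listed device is in the UP state. Remember that on an MGS or OSS server the list covers only the components running on that server.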