HP StorageWorks Scalable File Share User Manual
9.42 Troubleshooting client mount failures
When a node is booting, you can monitor the progress on the console. When the SFS service starts on the
node, it prints a message similar to the following:
Mounting sfs filesystems: sfsalias/scratch
If all the file systems mount successfully, the SFS service exits and the message changes to something similar
to the following:
Mounting sfs filesystems: sfsalias/scratch sfsalias/workspace sfsalias/data [ OK ]
If a file system is being mounted in the background, the SFS service displays a [WARNING] code when it exits.
If a mount operation stalls, you can troubleshoot the problem as follows:
1. Log in to the HP SFS system. If you are logging in from a client node, log in as follows:
# ssh sfsalias
2. Start the SFS CLI and run the show filesystem command as follows:
# sfsmgr
.
.
.
sfs> show filesystem
Name State Services
------------- -------------- ----------------------------------
data started mds4: running, ost[145-184]: running
scratch started mds5: running, ost[185-188]: running
3. Examine the output from the command, and consider the following factors:
• If any of the file system services are not in the running or recovering state, there is a problem with the file system. You must correct the problem before proceeding; otherwise, the file system cannot be mounted on any node.
In some cases, the output from the show filesystem command may indicate that there is a problem with the service, but that the service is still running. For example, a service in the data file system might show a status similar to the following:
running(raid: degraded)
This indicates that the service is mirrored over two arrays, and that one of the arrays has failed.
In this case, although the situation needs attention, the service is running and the problem will
not affect mount operations.
• If all file system services are in the running state, you can expect mount operations to complete in a relatively short time. However, several factors might extend the mount time to several minutes, including the following:
  • One or more MDS or OST services are running on their backup server. The mount operation first attempts to connect to the preferred server for the service. If this attempt fails because the primary server is down, Lustre attempts to connect to the backup server after approximately 50 seconds.
  • An InfiniBand port is not in the PORT_ACTIVE state.
    In this case, the SFS service does not attempt to mount the file system; instead, it polls the status of the port every few seconds. If the mount operation is taking place in the background (that is, the bg option was specified for the mount operation), the polling takes place in the background. However, if the mount operation is taking place in the foreground, the boot process stalls in the SFS service until InfiniBand has finished initializing.
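
The state check described in step 3 can be sketched as a small shell filter. This is an illustrative sketch, not part of the HP SFS software: the function name check_sfs_services is hypothetical, and it assumes the show filesystem output has been captured as text in the three-column layout shown above (Name, State, Services, with services separated by commas). It lists every service whose state is neither running nor recovering. A status such as running(raid: degraded) is treated as running, as described above.

```shell
#!/bin/sh
# Hypothetical helper (not an SFS command): reads captured `show filesystem`
# output on stdin and prints "<filesystem> <service> <state>" for each
# service that is not in the running or recovering state.
# Exit status: 0 if every service passes the check, 1 otherwise.
check_sfs_services() {
    awk '
        NF >= 3 {
            fs = $1
            # Rebuild the Services column from fields 3..NF.
            services = ""
            for (i = 3; i <= NF; i++)
                services = services (i > 3 ? " " : "") $i

            # Assumption: services are separated by ", " as in the sample.
            n = split(services, parts, /, */)
            for (j = 1; j <= n; j++) {
                # Each entry is expected to look like "mds4: running".
                # Lines without "name: state" entries (header, separator,
                # prompts) are skipped here.
                p = index(parts[j], ": ")
                if (p == 0)
                    continue
                name  = substr(parts[j], 1, p - 1)
                state = substr(parts[j], p + 2)
                # "running(raid: degraded)" still counts as running.
                if (state !~ /^(running|recovering)/) {
                    printf "%s %s %s\n", fs, name, state
                    bad = 1
                }
            }
        }
        END { exit (bad ? 1 : 0) }
    '
}
```

For example, if the command output has been saved to a file (the file name here is only a placeholder), the check could be run as follows:

# check_sfs_services < show_filesystem.txt

No output and an exit status of 0 suggest that the stalled mount is more likely caused by one of the timing factors listed above than by a stopped service.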