HP StorageWorks Scalable File Share User Manual

Troubleshooting file systems
9.26.9 Troubleshooting supplementary groups access
If a user receives unexpected access denied errors when using supplementary groups, use the following
procedures and information to troubleshoot the problem:
• Ensure that ssh access is set up correctly (refer to the Configuring supplementary groups section in
Chapter 9 of the HP StorageWorks Scalable File Share System Installation and Upgrade Guide for
more information).
• Verify that the group servers are being accessed regularly (through the ssh utility) by the MDS server.
If the group servers are being accessed regularly, there will be sshd messages in the
/var/log/messages file on the group servers. You may need to wait for the cache timeout period to
expire before you can expect to see a successful ssh access reported.
• Enter the hpls_getgroups command on the group servers and verify that it returns the correct
group information for the UID of the specified user.
• Examine the event log during the time that the denied user attempts to access the file.
• Attempt to force the upcall mechanism to start, and then examine the event log again. Because the
group information is cached, you must wait until the cache timeout period between access attempts
has expired before you attempt to force the upcall mechanism to start.
To force the upcall mechanism to start, and to examine the event log, enter the commands shown in
the following example, where south-mds1 is the MDS service, and 538 is a user UID:
# /usr/opt/hpls/bin/hpls_groups_upcall south-mds1 538
# sfsmgr show log recent
.
.
.
Nov 17 10:56:38 south2-adm hpls_groups_upcall: write(148) failed: Invalid argument
There will be an error in the log, as shown in the above example; this error is returned because the
MDS service did not request the hpls_groups_upcall script to run, and the message is normal in
this context. Any other error messages are not normal and may indicate why the user is being denied
access.
• The five-second period allowed for the upcall/ssh process is specified by the
/proc/fs/lustre/mds/filesystem_name-mdsnumber/group_acquire_expire file. You can
determine the allowed period for a file system as shown in this example, where the file system is
called data and has an MDS service called mds5:
# cat /proc/fs/lustre/mds/data-mds5/group_acquire_expire
5
If you suspect that the upcall/ssh process is failing because it is taking too long, you can increase
the interval as follows:
# echo 20 > /proc/fs/lustre/mds/data-mds5/group_acquire_expire
However, you must make this change each time the MDS service starts or restarts.
• Look in the event log for events such as the following:
group_upcall: Failed to get groups for uid 100, timeout waiting to connect to 172.100.100.100
You can search for such events using this example query:
sfs> show log facility=lustre && data contains "timeout waiting to connect to"
If you find such events, increase the value of the lustre.groups_ssh_timeout attribute, as
described in the Configuring supplementary groups section in Chapter 9 of the HP StorageWorks
Scalable File Share System Installation and Upgrade Guide.
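
Two of the checks above (looking for sshd messages on a group server, and reading or raising the
group_acquire_expire value) can be wrapped in small shell helpers. The following is a minimal
sketch only, not part of the HP SFS software: the function names get_group_expire,
raise_group_expire, and count_sshd_messages are hypothetical, and each function takes the file path
as an argument, so nothing is assumed about the file system or MDS service names on your system.

```shell
#!/bin/sh
# Illustrative helpers only; these function names are hypothetical and are
# not commands shipped with HP SFS.

# Print the current upcall/ssh period (in seconds) from the given
# group_acquire_expire file.
get_group_expire() {
    cat "$1"
}

# Raise the period to the wanted value, but only if the current value is
# lower. The change is lost when the MDS service starts or restarts, so
# the call must be repeated at that point.
raise_group_expire() {
    file=$1
    wanted=$2
    current=$(cat "$file") || return 1
    if [ "$current" -lt "$wanted" ]; then
        echo "$wanted" > "$file"
    fi
}

# Count the lines that mention sshd in a log file, for example
# /var/log/messages on a group server. A count of 0 suggests that the
# MDS server has not accessed this group server through ssh.
count_sshd_messages() {
    grep -c 'sshd' "$1"
}
```

For example, on the MDS server from the earlier example you might run, as root,
raise_group_expire /proc/fs/lustre/mds/data-mds5/group_acquire_expire 20, and on a group server
count_sshd_messages /var/log/messages.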