HP StorageWorks Scalable File Share User Manual

Troubleshooting file systems

9.26.9 Troubleshooting supplementary groups access

If a user receives unexpected access denied errors when using supplementary groups, use the following procedures and information to troubleshoot the problem:

• Ensure that ssh access is set up correctly (refer to the Configuring supplementary groups section in Chapter 9 of the HP StorageWorks Scalable File Share System Installation and Upgrade Guide for more information).

• Verify that the group servers are being accessed regularly (through the ssh utility) by the MDS server.

If the group servers are being accessed regularly, there will be sshd messages in the /var/log/messages file on the group servers. You may need to wait for the cache timeout period to expire before you can expect to see a successful ssh access reported.
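As a rough way to check for the sshd messages described above, the sketch below counts matching entries in a log file on a group server. The function name `count_sshd_messages` is hypothetical, and the `sshd[` pattern is an assumption about how sshd tags its lines in your syslog configuration; adjust it if your entries look different.

```shell
# Hypothetical helper: count sshd entries in a syslog-style file.
# The default path comes from the text above; the 'sshd[' pattern is an
# assumption about the local syslog format.
count_sshd_messages() {
    log_file="${1:-/var/log/messages}"
    # grep -c prints the number of matching lines (0 if there are none).
    grep -c 'sshd\[' "$log_file"
}
```

Run it with no argument to read /var/log/messages. If you run it before and after the cache timeout period has expired, a rising count suggests that the MDS server is reaching the group server.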

• Enter the hpls_getgroups command on the group servers and verify that it returns the correct group information for the UID of the specified user.

• Examine the event log during the time that the denied user attempts to access the file.

• Attempt to force the upcall mechanism to start, and then examine the event log again. Because the group information is cached, you must wait until the cache timeout period between access attempts has expired before you attempt to force the upcall mechanism to start.

To force the upcall mechanism to start, and to examine the event log, enter the commands shown in the following example, where south-mds1 is the MDS service and 538 is a user UID:

# /usr/opt/hpls/bin/hpls_groups_upcall south-mds1 538
# sfsmgr show log recent
.

Nov 17 10:56:38 south2-adm hpls_groups_upcall: write(148) failed: Invalid argument

There will be an error in the log, as shown in the above example; this error is returned because the MDS service did not request the hpls_groups_upcall script to run, and the message is normal in this context. Any other error messages are not normal and may indicate why the user is being denied access.
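To separate the expected message from anything else, a filter along the following lines may help. It is only a sketch: the function name `unexpected_upcall_errors` is hypothetical, and it assumes that the log text arrives on standard input with one event per line and that the expected message always has the form shown in the example above.

```shell
# Hypothetical filter: show upcall-related log lines other than the
# expected "write(N) failed: Invalid argument" message.
# Assumes one event per line on standard input.
unexpected_upcall_errors() {
    grep -E 'groups?_upcall' \
        | grep -Ev 'write\([0-9]+\) failed: Invalid argument'
}
```

For example, you could pipe the output of `sfsmgr show log recent` into `unexpected_upcall_errors`. Any lines that remain are candidates for explaining why the user is being denied access.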

• The five-second period allowed for the upcall/ssh process is specified by the /proc/fs/lustre/mds/filesystem_name-mdsnumber/group_acquire_expire file. You can determine the allowed period for a file system as shown in this example, where the file system is called data and has an MDS service called mds5:

# cat /proc/fs/lustre/mds/data-mds5/group_acquire_expire

If you suspect that the upcall/ssh process is failing because it is taking too long, you can increase the interval as follows:

# echo 20 > /proc/fs/lustre/mds/data-mds5/group_acquire_expire

However, you must make this change each time the MDS service starts or restarts.
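Because the value has to be reapplied after every start or restart of the MDS service, it can be convenient to wrap the write and the read-back check in one small function. This is a minimal sketch: the function name `set_group_acquire_expire` is hypothetical, and it assumes that the file accepts a plain integer number of seconds, as the examples above suggest.

```shell
# Hypothetical helper: write a new timeout and read it back.
# $1 is the group_acquire_expire file, $2 the new value in seconds.
set_group_acquire_expire() {
    proc_file="$1"
    seconds="$2"
    # Accept digits only, so that a typo is never written to /proc.
    case "$seconds" in
        ''|*[!0-9]*) echo "invalid timeout: $seconds" >&2; return 1 ;;
    esac
    echo "$seconds" > "$proc_file" || return 1
    # Print the stored value so the caller can confirm the change.
    cat "$proc_file"
}
```

For the file system in the example above, you would call it as root with /proc/fs/lustre/mds/data-mds5/group_acquire_expire and 20 as the two arguments.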

• Look in the event log for events such as the following:

group_upcall: Failed to get groups for uid 100, timeout waiting to connect to 172.100.100.100

You can search for such events using this example query:

sfs> show log facility=lustre && data contains "timeout waiting to connect to"

If you find such events, increase the value of the lustre.groups_ssh_timeout attribute, as described in the Configuring supplementary groups section in Chapter 9 of the HP StorageWorks Scalable File Share System Installation and Upgrade Guide.
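Before changing the timeout, it can be useful to see which group servers the timeout events refer to. The sketch below lists each address once. The function name `list_timeout_hosts` is hypothetical, and it assumes that the log text arrives on standard input with each event on a single line and the address as the last field, as in the sample event above.

```shell
# Hypothetical helper: list the distinct hosts named in group_upcall
# timeout events read from standard input.
# Assumes one event per line, with the address as the last field.
list_timeout_hosts() {
    grep 'timeout waiting to connect to' \
        | awk '{ print $NF }' \
        | sort -u
}
```

You could, for example, pipe the output of `sfsmgr show log recent` into `list_timeout_hosts`. If only one address appears, the problem may lie with that group server rather than with the timeout value.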