HP StorageWorks Scalable File Share User Manual

9.25.1.1 Determining whether Voltaire InfiniBand interconnect is loaded ...........................................9-16

9.25.1.2 Starting, stopping, and auto-starting the Voltaire InfiniBand interconnect ...............................9-17

9.25.1.2.1 Using the ib-setup utility..............................................................................................9-17

9.25.1.2.2 From the command line..............................................................................................9-18

9.25.1.3 Server hangs when Voltaire InfiniBand interconnect is started ..............................................9-18

9.25.2 Voltaire HCA adapter is not recognized ................................................................................9-18

9.25.3 Voltaire HCA adapter is not activated ...................................................................................9-19

9.25.4 Connection and data transfer problems .................................................................................9-20

9.25.5 AD_TAVOR : vvi_mlx_poll_for_completion messages...............................................................9-20

9.26 Troubleshooting file systems......................................................................................................9-21

9.26.1 Problems creating a file system.............................................................................................9-21

9.26.2 Identifying servers serving OST services.................................................................................9-22

9.26.3 The start filesystem command may fail twice...........................................................................9-22

9.26.4 Troubleshooting the stop filesystem command.........................................................................9-23

9.26.5 Using the MPI Lustre repair utility to repair file systems ............................................................9-24

9.26.5.1 Using the repair-lfsck script and the generated file system-specific shell scripts .......................9-25

9.26.5.1.1 Running a generated file system repair script ................................................................9-26

9.26.5.2 Repairing or verifying individual MDS or OST services .......................................................9-27

9.26.6 MDS or OST services stay in the recovering state....................................................................9-28

9.26.7 MDS and OST service recovery process ................................................................................9-28

9.26.8 Rebalancing file system services ...........................................................................................9-30

9.26.9 Troubleshooting supplementary groups access........................................................................9-31

9.27 Troubleshooting file system performance ....................................................................................9-32

9.27.1 Performance troubleshooting ................................................................................................9-32

9.27.2 Verifying file striping ...........................................................................................................9-36

9.27.2.1 Recreating files ..............................................................................................................9-38

9.27.3 Checking for unbalanced distribution of OST services .............................................................9-39

9.27.4 Checking for unbalanced controllers in EVA4000 arrays.........................................................9-40

9.27.5 Examining the system logs for errors .....................................................................................9-41

9.27.6 Examining EVA4000 storage subsystems for errors.................................................................9-42

9.27.7 Examining SFS20 storage subsystems for errors......................................................................9-42

9.27.8 Examining the interconnect switch for errors...........................................................................9-43

9.27.9 Verifying performance statistics on Fibre Channel switches ......................................................9-44

9.27.10 Troubleshooting slow commit messages .................................................................................9-46

9.28 Troubleshooting EVA4000 array connectivity..............................................................................9-47

9.29 Troubleshooting LUN presentation .............................................................................................9-49

9.30 Accessing consoles..................................................................................................................9-51

9.31 Accessing the iLO component ...................................................................................................9-51

9.31.1 Configuring the iLO component............................................................................................9-51

9.31.1.1 Using the remote console ................................................................................................9-51

9.31.1.2 Using a Web browser ....................................................................................................9-52

9.31.2 Troubleshooting iLO access..................................................................................................9-52

9.32 Troubleshooting licenses...........................................................................................................9-53

9.33 Troubleshooting failed SFS20 arrays..........................................................................................9-54

9.33.1 Identifying failed SFS20 arrays.............................................................................................9-54

9.33.2 Recovering from a temporary SFS20 array failure...................................................................9-56

9.33.3 Recovering degraded MDS or OST services...........................................................................9-56

9.34 Handling Disk Errors on SFS20 storage......................................................................................9-59

9.34.1 Disks showing the removed/failed state.................................................................................9-60

9.34.2 Disks showing the predict fail state........................................................................................9-60

9.34.3 Disks showing the logging errors state...................................................................................9-60

9.35 Recovering degraded MDS services on systems using EVA4000 storage........................................9-61

9.36 System log files .......................................................................................................................9-64

9.37 Administration service restarts every one minute (attempting to start the evlogd daemon)..................9-64

9.38 The MDS service fails with an ASSERTION(ino ==inode->i_ino) message .......................................9-65

9.39 The MDS service repeatedly crashes with an LBUG error..............................................................9-65

9.40 Rebuilding logical drives after disk failures .................................................................................9-66

9.41 Determining if the Network ID of a server on a Quadrics or Myrinet interconnect has been changed ....9-68

9.42 Troubleshooting client mount failures..........................................................................................9-69