Setting up the crawl – Google Search Appliance Installing the Google Search Appliance User Manual

20. Click Check Access to Web Servers.

The search appliance attempts to crawl the content files. If the search appliance cannot reach one or
more of the locations, error messages appear in the configuration wizard interface.

You see a message saying the following:

Congratulations! You have configured the appliance. If no warnings or errors
have been displayed, you can now disconnect your laptop and use the appliance.
Use the Admin Console application for day-to-day administration.

The configuration values are listed. You can disconnect the local computer or make further changes
to the settings. To change a value, click the Edit Settings link.

Determining Whether the Software Version is Current

The software preinstalled on your Google Search Appliance might not be the most recent version. After
you configure the search appliance, Google recommends that you check the software version on the
search appliance, then visit the Support site to check for updates.

To determine the version of the software currently installed on your search appliance, click the About link
on any Admin Console page. A new page opens that displays the search appliance version.

Visit the following URL for instructions on how to log in to the Support portal:

http://support.google.com/enterprisehelp/bin/answer.py?answer=1120726

When you visit the Support portal, navigate to the page that lists supported software versions, then
determine which is the most recent version and the proper update path to follow.
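
When you compare the version shown on the About page with the versions listed on the Support portal, compare the numbers component by component rather than as plain text, so that, for example, 7.10 is treated as newer than 7.9. The following is a minimal Python sketch of that comparison. It assumes that versions are written as dot-separated integers; the function names and the version strings are illustrative only and are not actual release numbers or part of any appliance tooling.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Split a dotted version string such as '7.2.0' into a tuple of integers."""
    return tuple(int(part) for part in version.strip().split("."))


def is_current(installed: str, latest: str) -> bool:
    """Return True if the installed version is at least as new as the latest one.

    Shorter versions are padded with zeros, so '7.2' and '7.2.0' compare equal.
    """
    a, b = parse_version(installed), parse_version(latest)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


if __name__ == "__main__":
    # Hypothetical values: one read from the About page, one from the Support portal.
    installed_version = "7.2.0"
    latest_version = "7.4.0"
    if is_current(installed_version, latest_version):
        print("The installed software is current.")
    else:
        print("An update is available; check the update path on the Support portal.")
```

The comparison only tells you whether a newer version exists. The proper update path still has to be taken from the Support portal.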

The next section, “Setting Up the Crawl” on page 11, contains instructions for connecting to the Admin
Console and configuring the initial crawl of your content.

Setting Up the Crawl

Crawl is the process by which the Google Search Appliance locates content to be indexed. You define the
start URLs where the crawl begins, the URL patterns to follow or to exclude from the crawl, and the file
types to include or exclude.
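
To illustrate how these settings work together, the following Python sketch decides whether a candidate URL falls inside the crawl. It is a simplified model, not the appliance's implementation: it assumes that patterns are plain URL prefixes and that file types are identified by file extension, and the function name, parameter names, and example.com addresses are hypothetical. The pattern syntax that the search appliance actually accepts is described in Administering Crawl.

```python
from typing import Sequence
from urllib.parse import urlparse


def should_crawl(
    url: str,
    follow_prefixes: Sequence[str],
    exclude_prefixes: Sequence[str],
    excluded_extensions: Sequence[str],
) -> bool:
    """Return True if the URL is inside the crawl scope of this simplified model."""
    # Exclusions take priority over follow patterns in this model.
    if any(url.startswith(prefix) for prefix in exclude_prefixes):
        return False
    # Compare the file extension case-insensitively, ignoring any query string.
    path = urlparse(url).path.lower()
    if any(path.endswith(ext.lower()) for ext in excluded_extensions):
        return False
    # Anything left must match at least one follow pattern.
    return any(url.startswith(prefix) for prefix in follow_prefixes)


if __name__ == "__main__":
    follow = ["http://intranet.example.com/"]
    exclude = ["http://intranet.example.com/private/"]
    skip_types = [".zip", ".exe"]
    for candidate in (
        "http://intranet.example.com/docs/guide.html",
        "http://intranet.example.com/private/notes.html",
        "http://intranet.example.com/downloads/tools.zip",
        "http://other.example.com/index.html",
    ):
        decision = "crawl" if should_crawl(candidate, follow, exclude, skip_types) else "skip"
        print(decision, candidate)
```

In this model a start URL is useful only if it also matches a follow pattern. Otherwise the crawl would have nowhere to begin.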

When you complete the process described in “Configuring the Network Settings” on page 8, the crawl has
not yet started. This section contains instructions for connecting to the Admin Console, entering start URLs
and URL patterns, starting the crawl process, and confirming that the crawl is proceeding normally. For
complete information on crawl, start URLs, and URL patterns, see Administering Crawl.

To obtain context-sensitive help from any page in the Admin Console, click the Help link. You can also
view help pages by clicking the Help Center link in the horizontal blue bar in the upper right of the
screen.

The high-level steps for setting up the initial crawl are:

“Logging in to the Admin Console” on page 12

“Setting Up and Starting the Crawl” on page 12

“Checking the Crawl Status” on page 13

“Checking the Serving Status” on page 14