Google Search Appliance: Planning for Search Appliance Installation
How Does the Search Appliance Work?
The Google Search Appliance is a one-stop search and index solution for businesses of all sizes. Using a
search appliance, you can quickly deploy search within an enterprise. By default, a search appliance can
index and serve content located on a file system or a web server. You can also configure the Google
Search Appliance to use a connector manager and a connector to index and serve content located in a
content management system such as EMC Documentum or Microsoft SharePoint.
The search appliance comes with Google software installed on powerful hardware, simplifying the
planning process because you do not need to choose a hardware platform.
The Google Search Appliance model GB-7007 can be licensed for up to 10 million documents and the
Google Search Appliance model GB-9009 for up to 30 million documents.
The Google Search Appliance model G100 can be licensed for up to 20 million documents and the
Google Search Appliance model G500 for up to 100 million documents.
This section contains an introduction to the basic operations of the Google Search Appliance and
descriptions of the preinstallation planning process.
Installation
Before an intranet, web site, or content repository can be indexed, you must install the search appliance
on your network and set up the software on the appliance. Installing the search appliance requires
physically attaching it to the network and then starting the search appliance.
Setting up the software on a search appliance includes the following tasks:
• Ensuring that the correct ports are available on your network
• Providing correct network settings, so that the search appliance can communicate with the computers on the network
• Providing email and time settings
• Assigning a password to the administrator account
• Ensuring that the search appliance has access to the file system or web servers where the content files are located
• Configuring the initial crawl of your file system or web servers
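The port and access checks in the list above can be approximated before installation with a short script. The following Python sketch is illustrative only: the helper names port_is_open and check_ports are hypothetical, and the host and port numbers you pass in are your own values, not a list taken from this guide. Consult the appliance documentation for the ports it actually requires.

```python
import socket
from typing import Dict, Iterable


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_ports(host: str, ports: Iterable[int], timeout: float = 2.0) -> Dict[int, bool]:
    """Map each port to whether a TCP connection to it could be opened."""
    return {port: port_is_open(host, port, timeout) for port in ports}
```

For example, check_ports("webserver.example.com", [80, 443]) reports whether a placeholder web server accepts connections on two placeholder ports. A successful TCP connection shows only that the port is reachable from the machine running the script, not that the search appliance has permission to read the content.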
If you are indexing content in a content management system, you must also install a connector
manager and the connector for the particular content management system. Review the documentation
set for the correct connector software version, which
provides information on preinstallation tasks, required software, and required hardware for the
connector manager and connectors.
Crawl
Crawl is the process by which the Google Search Appliance locates content to be indexed. Crawl is a pull
process, where the search appliance pulls content from the content location. The search appliance can
also crawl a relational database to obtain metadata.
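To make the pull model concrete, the following Python sketch shows a crawler that initiates every request itself; the content location never pushes anything. It is a simplified illustration, not the appliance's crawl algorithm: the SITE dictionary, the fetch stand-in, the pull_crawl function, and the example URLs are all hypothetical, and following links from fetched pages is an assumption made for the example.

```python
from collections import deque
from typing import Callable, Dict, Iterable, List, Optional, Tuple

# Hypothetical in-memory content location: URL -> (document text, outgoing links).
SITE: Dict[str, Tuple[str, List[str]]] = {
    "http://intranet.example/": (
        "Home",
        ["http://intranet.example/a", "http://intranet.example/b"],
    ),
    "http://intranet.example/a": ("Page A", ["http://intranet.example/b"]),
    "http://intranet.example/b": ("Page B", ["http://intranet.example/"]),
}

FetchResult = Optional[Tuple[str, List[str]]]


def fetch(url: str) -> FetchResult:
    """Stand-in for a network request; returns (text, links) or None if missing."""
    return SITE.get(url)


def pull_crawl(
    start_urls: Iterable[str],
    fetch_fn: Callable[[str], FetchResult] = fetch,
) -> Dict[str, str]:
    """Pull documents breadth-first, beginning at start_urls."""
    queue = deque(dict.fromkeys(start_urls))  # de-duplicate, keep order
    seen = set(queue)
    documents: Dict[str, str] = {}
    while queue:
        url = queue.popleft()
        result = fetch_fn(url)  # the crawler asks; the source only answers
        if result is None:
            continue  # unreachable or missing content is skipped
        text, links = result
        documents[url] = text
        for link in links:
            if link not in seen:  # never request the same URL twice
                seen.add(link)
                queue.append(link)
    return documents
```

Calling pull_crawl(["http://intranet.example/"]) returns the text of all three hypothetical pages, because each one is reachable by links from the starting URL.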
When you configure the software for crawling, you define three sets of URLs, which can be in HTTP or
server message block (SMB) format: