Google Search Appliance Configuring Distributed Crawling and Serving version 6.14 and later User Manual



Before You Configure Distributed Crawling and Serving

This section provides a checklist of information you need to collect and decisions you need to make
before you configure distributed crawling and serving.

Task

Description

Your Values

Determine which Google
Search Appliances will
participate in the
configuration.

Any Google Search Appliance model running software
version 6.0 or later can participate, but all search appliances
must be the same model running the same software
version.

Determine the appliance
IDs of the participating
search appliances.

The appliance IDs can be found in the Admin Console
under Administration > License or by right-clicking the
About link on any Admin Console page and choosing Open
link in new tab.

Determine the host
names or public IP
addresses of the search
appliances in the
configuration.

The host names or IP addresses are required during the
initial configuration process.

Determine the virtual
private network IP
addresses for the search
appliances.

The virtual private network IP addresses are used for
private communication among the search appliances in the
configuration. These addresses must fall within the private
address space defined in RFC 1918 and must not overlap
with any other private address space in use on your
network.

Determine which search
appliance is the master
search appliance in the
configuration.

Crawl, search, and index are all configured on the master
search appliance.

Determine the secret
token that the search
appliances will use to
recognize each other
within the configuration.

The nodes in the configuration use the secret tokens to
authenticate to each other. The secret token must include
only printable ASCII characters. Each search appliance in a
distributed crawling configuration has its own associated
secret token, which you specify on the GSA > Host
Configuration page.

Determine whether the
master node is crawling
or has an index from
which it is serving.

Do not start the crawl on the node before configuring
distributed crawling and serving.

Determine whether the
search appliances in the
configuration crawled
substantially similar
bodies of documents.

If the search appliances crawled similar bodies of
documents, the indexes are substantially similar and
rebalancing the index after you set up the distributed
crawling and serving configuration will be inefficient. In this
situation, Google recommends that you reset the index on
the non-master nodes before you set up the configuration.

Configure feeds only on
the master.

Feeds can be indexed only on the master.
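
Several items in this checklist can be checked mechanically before you begin. The following Python sketch is illustrative only and is not part of the search appliance software. It checks that all appliances in a plan share one model and software version, that each virtual private network IP address falls within the RFC 1918 private address space and does not fall inside a network already in use, and that each secret token contains only printable ASCII characters. The plan layout, the field names (appliance_id, model, version, vpn_ip, secret_token), and the function names are assumptions made for this example, as is the choice to treat the space character as printable.

```python
import ipaddress

# The three private IPv4 blocks defined in RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(address):
    """Return True if the address is an IPv4 address inside an RFC 1918 block."""
    ip = ipaddress.ip_address(address)  # raises ValueError on malformed input
    return ip.version == 4 and any(ip in net for net in RFC1918_NETWORKS)


def is_printable_ascii(token):
    """Return True if the token is non-empty and uses only characters 0x20-0x7E.

    Assumption: the space character (0x20) counts as printable.
    """
    return bool(token) and all(0x20 <= ord(ch) <= 0x7E for ch in token)


def check_plan(appliances, networks_in_use):
    """Return a list of problems found in a (hypothetical) deployment plan."""
    problems = []

    # All participating appliances must share one model and one software version.
    if len({(a["model"], a["version"]) for a in appliances}) > 1:
        problems.append("appliances differ in model or software version")

    in_use = [ipaddress.ip_network(n) for n in networks_in_use]
    for a in appliances:
        name = a["appliance_id"]
        if not is_rfc1918(a["vpn_ip"]):
            problems.append(f"{name}: {a['vpn_ip']} is not an RFC 1918 address")
        elif any(ipaddress.ip_address(a["vpn_ip"]) in net for net in in_use):
            problems.append(f"{name}: {a['vpn_ip']} overlaps a network already in use")
        if not is_printable_ascii(a["secret_token"]):
            problems.append(f"{name}: secret token is not printable ASCII")

    return problems


# Example plan with placeholder values; replace them with the values you collected.
plan = [
    {"appliance_id": "ID-A", "model": "model-x", "version": "6.14",
     "vpn_ip": "10.200.0.1", "secret_token": "token-A"},
    {"appliance_id": "ID-B", "model": "model-x", "version": "6.14",
     "vpn_ip": "10.200.0.2", "secret_token": "token-B"},
]
print(check_plan(plan, networks_in_use=["10.0.0.0/16", "192.168.1.0/24"]))
```

An empty list means the plan passed these checks. The sketch does not contact any search appliance, so it cannot confirm appliance IDs, host names, or the state of any index.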