Configuring distributed crawling and serving – Google Search Appliance Configuring Distributed Crawling and Serving version 7.2 User Manual

Configuring Distributed Crawling and Serving

Observe the following precautions in configuring distributed crawling:

Do not combine a unified environment with distributed crawling.

Feeds must be configured only on the admin master search appliance.

If the search appliances you are using in the distributed crawling and serving configuration crawled
substantially similar bodies of documents, Google recommends that you reset the indexes on the
non-master search appliances before configuring distributed crawling and serving.

To configure distributed crawling and serving:

1. Log in to the Admin Console of the machine intended to be the master search appliance.

2. If the crawl is currently running or if the search appliance already has an index from which it is
serving, click Content Sources > Diagnostics > Crawl Status > Pause Crawl.

3. Click GSAⁿ > Configuration.

4. Type the number of shards in the Number of shards field. A shard in the distributed crawling
configuration comprises a primary search appliance and, optionally, one or more search appliances
(replicas) in a mirroring configuration.

5. Type the total number of nodes (search appliances) to be configured in the Number of nodes field.
This number includes the primary search appliances, as well as replica search appliances to be
configured.
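
As a rough illustration of how the values in steps 4 and 5 relate, the following Python sketch counts shards and nodes for a planned layout. It is a planning aid only and does not talk to a search appliance; the `Shard` class, the `totals` function, and the appliance names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Shard:
    """One shard: a primary search appliance plus zero or more replicas."""
    primary: str
    replicas: Tuple[str, ...] = ()

    def node_count(self) -> int:
        # The primary always counts as one node; each replica adds one more.
        return 1 + len(self.replicas)


def totals(shards: List[Shard]) -> Tuple[int, int]:
    """Return (number of shards, number of nodes) for the planned layout."""
    return len(shards), sum(shard.node_count() for shard in shards)


# Hypothetical layout: two shards, the first mirrored by one replica.
layout = [
    Shard(primary="gsa-a", replicas=("gsa-a-mirror",)),
    Shard(primary="gsa-b"),
]

num_shards, num_nodes = totals(layout)
print("Number of shards:", num_shards)
print("Number of nodes:", num_nodes)
```

Under these assumptions, the first value would go in the Number of shards field and the second in the Number of nodes field.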

Task: Determine whether the search appliances in the configuration crawled substantially similar
bodies of documents.
Description: If the search appliances crawled similar bodies of documents, the indexes are
substantially similar and rebalancing the index after you set up the distributed crawling and
serving configuration will be inefficient. In this situation, Google recommends that you reset the
index on the non-master nodes before you set up the configuration.
Your Values:

Task: Configure feeds only on the master.
Description: Feeds can only be indexed on the master.
Your Values:

Task: If you are using Kerberos, ensure that you configure Kerberos on the master and non-master
nodes.
Description: Kerberos keytab files are unique and cannot be used on more than one search appliance.
You must generate and import a different Kerberos keytab file for each search appliance. When you
configure Kerberos on a non-master node, use a different Mechanism Name from the one used for the
master. The Mechanism Name for the non-master node will be synchronized automatically with the
master’s Mechanism Name. After they are synchronized, the non-master node’s Mechanism Name will
match the master’s Mechanism Name.
Your Values:

Task: If you are using SSL certificates, ensure that you install them on the master and non-master
nodes.
Your Values:
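
The feed and Kerberos tasks above can be checked on paper before you touch the Admin Console. The following Python sketch is one way to do that, assuming you record each node's planned settings by hand. The `Node` fields, the `preflight` function, and all names and file labels are hypothetical; nothing here reads from or writes to a search appliance.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Node:
    name: str                             # hypothetical label for a search appliance
    has_feeds: bool = False               # True if feeds are configured on this node
    keytab: Optional[str] = None          # label of the imported keytab file, if any
    mechanism_name: Optional[str] = None  # Mechanism Name entered when configuring Kerberos


def preflight(master: Node, non_masters: List[Node]) -> List[str]:
    """Return a list of problems; an empty list means none were found."""
    problems: List[str] = []

    # Feeds belong on the master only.
    for node in non_masters:
        if node.has_feeds:
            problems.append(f"{node.name}: feeds must be configured only on the master")

    # Each keytab file may be used on one search appliance only.
    owner: Dict[str, str] = {}
    for node in [master, *non_masters]:
        if node.keytab is None:
            continue
        if node.keytab in owner:
            problems.append(f"{node.name}: keytab already used on {owner[node.keytab]}")
        else:
            owner[node.keytab] = node.name

    # If Kerberos is in use, every node needs it, and each non-master node
    # starts with a Mechanism Name different from the master's.
    if master.keytab is not None:
        for node in non_masters:
            if node.keytab is None:
                problems.append(f"{node.name}: Kerberos is not configured")
            elif node.mechanism_name == master.mechanism_name:
                problems.append(f"{node.name}: Mechanism Name must differ from the master's")

    return problems


master = Node("gsa-a", has_feeds=True, keytab="a.keytab", mechanism_name="krb-a")
others = [
    Node("gsa-b", keytab="b.keytab", mechanism_name="krb-b"),
    Node("gsa-c", has_feeds=True, keytab="a.keytab", mechanism_name="krb-a"),
]

for problem in preflight(master, others):
    print(problem)
```

In this made-up plan, `gsa-b` passes and `gsa-c` is flagged for each rule it breaks. The Mechanism Name check applies only at configuration time, because the names are synchronized afterward.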