Serving from master and nonmaster nodes, About security – Google Search Appliance Configuring Distributed Crawling and Serving version 7.2 User Manual

After the distributed crawl configuration is set up, the four search appliances behave as if they are a
single search appliance. Crawling, serving, collections, front ends, and other features are configured on
Shard 0, the master node of the configuration. Feeds are sent only to the admin master. The crawl
process is automatically distributed among the four search appliances. Any of the nodes can serve
results. Each search appliance in the distributed crawl configuration communicates with all of the other
search appliances. The diagram above does not show each of the connections between search
appliances.

After the configuration is set up, you can add nodes on the Admin Console, and the index is
automatically redistributed among the existing and new nodes. To delete nodes, disable
distributed crawling and serving, reset the index on each search appliance, reconfigure
distributed crawling and serving, and then reindex the content.

Serving from Master and Nonmaster Nodes

In this release, you can serve results from both the master and nonmaster nodes in distributed crawling
and serving configurations, whether or not you have replicas configured and regardless of whether the
mirroring configuration is active-active or active-passive.

If you are using a load balancer, a client creates a separate session for each node that it uses. In some
cases, this might slow down initial searches because of the overhead added by user authentication
requests. You can minimize this issue by using a sticky load balancer that can preserve user sessions for
time periods of five minutes or more. In the absence of a sticky load balancer, search users may have to
log in N times, where N is the number of search appliances in the configuration.
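
The effect of stickiness on the number of logins can be illustrated with a toy model. The sketch below is only an illustration under stated assumptions: the node names, the round-robin policy, the `count_logins` helper, and the idea that a session on a node never expires are all hypothetical and do not describe how the search appliance or any particular load balancer is implemented.

```python
import itertools

STICKY_TTL = 300  # seconds; the five-minute minimum suggested in the text


def count_logins(request_times, nodes, sticky):
    """Count login prompts for one user sending requests at the given times.

    Assumption: a user logs in once per node and that session never expires.
    Without stickiness, requests rotate round-robin over the nodes. With
    stickiness, the user stays on the same node while requests arrive within
    STICKY_TTL seconds of the previous one.
    """
    rotation = itertools.cycle(nodes)
    sessions = set()      # nodes on which the user already has a session
    pinned_node = None    # node used by the previous request
    last_seen = None      # time of the previous request
    logins = 0
    for t in request_times:
        keep = sticky and pinned_node is not None and t - last_seen <= STICKY_TTL
        node = pinned_node if keep else next(rotation)
        pinned_node, last_seen = node, t
        if node not in sessions:
            sessions.add(node)
            logins += 1
    return logins


nodes = ["gsa0", "gsa1", "gsa2", "gsa3"]  # hypothetical node names
times = [0, 60, 120, 180]                 # one search per minute
print(count_logins(times, nodes, sticky=True))   # 1
print(count_logins(times, nodes, sticky=False))  # 4
```

In this model, four searches spread over four nodes cost four logins without stickiness, which matches the "N times" case described above, and a single login with it.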

About Security

The Google Search Appliance uses secret tokens and private IP addresses to enforce security within a
distributed crawling configuration.

The search appliances in a distributed crawling configuration authenticate each other using shared
secret tokens that you provide during configuration. The shared secret tokens must consist only of
printable ASCII characters.
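
A token can be checked against this rule before it is entered during configuration. The following is a minimal sketch, assuming that "printable ASCII" means the range 0x20 to 0x7E (space through tilde) and that an empty token is not acceptable; the source states neither detail, and the `is_valid_token` helper is hypothetical rather than part of the appliance.

```python
def is_valid_token(token: str) -> bool:
    """Return True if the token is non-empty and every character is
    printable ASCII, taken here to mean code points 0x20 through 0x7E.

    Assumption: whether the appliance accepts spaces or empty tokens is
    not stated in the manual; adjust the range if it does not.
    """
    return bool(token) and all(0x20 <= ord(ch) <= 0x7E for ch in token)


print(is_valid_token("s3cret-Token!"))  # True
print(is_valid_token("pässword"))       # False: contains a non-ASCII character
print(is_valid_token("tab\there"))      # False: contains a control character
```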

There are no restrictions on the public IP addresses assigned to the search appliances in the
configuration beyond a requirement that a search appliance must be able to reach another search
appliance’s public IP address on UDP port 500 and on IP protocol number 51 (IPsec AH). Both are
used by IPsec, the security protocol for communications among the appliances in the configuration.

Certain communications among the search appliances in a distributed crawling configuration are
conducted over a virtual private network, including search requests, search credentials transmitted as
sessions, and search results that include snippets, whether the results are authorized or not authorized.
When you set up a distributed crawling configuration, you must assign the private IP addresses and
secret tokens to each machine in the configuration.

The following guidelines apply to the private network IP addresses that you assign in a distributed
crawling configuration:

You can assign or change the private IP addresses at any time.

The private IP addresses must be different from the IP addresses that will be crawled on your
internal network. For example, if you use 10.0.0.0/8 for your intranet, choose the
private IP addresses from the 192.168.0.0/24 network. If the 192.168.0.0/24 network is also in use,
try 192.168.1.0/24 or the 172.16.0.0/12 range.
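
The selection described in this guideline can be checked with Python's standard `ipaddress` module. This is a sketch under assumptions: the `pick_private_network` helper and the list of in-use networks are hypothetical, and the candidate order simply follows the example in the text.

```python
import ipaddress


def pick_private_network(in_use, candidates):
    """Return the first candidate network that overlaps none of the
    in-use (crawled) networks, or None if every candidate overlaps."""
    used = [ipaddress.ip_network(n) for n in in_use]
    for cand in candidates:
        net = ipaddress.ip_network(cand)
        if not any(net.overlaps(u) for u in used):
            return net
    return None


# Candidate ranges in the order suggested by the text.
CANDIDATES = ["192.168.0.0/24", "192.168.1.0/24", "172.16.0.0/12"]

# Hypothetical intranet that already uses 10.0.0.0/8 and 192.168.0.0/24.
choice = pick_private_network(["10.0.0.0/8", "192.168.0.0/24"], CANDIDATES)
print(choice)  # 192.168.1.0/24
```

Using `overlaps` rather than a string comparison also catches cases where the intranet uses a wider range, such as 192.168.0.0/16, that contains a candidate.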