Pushing a feed to the search appliance – Google Search Appliance Getting the Most from Your Google Search Appliance User Manual

Crawling and Indexing

The following figure provides an overview of indexing hard-to-find content by using feeds.

Pushing a Feed to the Search Appliance

To push a content feed to the search appliance, you must provide the following components:

Feed—An XML document that tells the search appliance about the contents that you want to push

Feed client—An application or web page that pushes the feed to a feeder process on the search
appliance

You can use one of the feed clients described in the Feeds Protocol Developer’s Guide or write your own.
For information about writing a feed client, refer to “Writing Applications with the Feeds Protocol” on
page 69.
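
As an illustration of the feed component, the following Python sketch builds a small feed document. It assumes the gsafeed layout described in the Feeds Protocol Developer's Guide (a header with a data source and feed type, followed by a group of records). The function name build_feed, the data source name, and the URL are hypothetical and chosen only for this example.

```python
import xml.etree.ElementTree as ET

# DOCTYPE line assumed from the Feeds Protocol Developer's Guide.
DOCTYPE = '<!DOCTYPE gsafeed PUBLIC "-//Google//DTD GSA Feeds//EN" "">'


def build_feed(datasource, feedtype, urls):
    """Return a feed XML document (as a string) that lists the given URLs."""
    root = ET.Element("gsafeed")

    # The header names the data source and the type of feed.
    header = ET.SubElement(root, "header")
    ET.SubElement(header, "datasource").text = datasource
    ET.SubElement(header, "feedtype").text = feedtype

    # Each record describes one URL that the feed tells the appliance about.
    group = ET.SubElement(root, "group")
    for url in urls:
        ET.SubElement(group, "record", url=url, mimetype="text/html")

    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + DOCTYPE + "\n" + body


feed_xml = build_feed(
    "example_source",  # hypothetical data source name
    "metadata-and-url",
    ["http://intranet.example.com/docs/report.html"],  # hypothetical URL
)
print(feed_xml)
```

A feed client would then send this document to the feeder process on the search appliance. The sketch stops short of that step; the endpoint and form fields to use are documented in the Feeds Protocol Developer's Guide.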

URL Patterns and Trusted IP lists that you define with the Admin Console ensure that your index only
lists content from desirable sources. When pushing URLs with a feed, you must verify that the Admin
Console will accept the feed and allow your content through to the index. For a feed to succeed, it must
be fed from a trusted IP address and at least one URL in the feed must pass the rules defined in the
Admin Console.
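
The two acceptance conditions above can be checked before a feed is pushed. The following Python sketch is a simplified local pre-check, not the search appliance's own logic: it assumes that trusted sources are given as IP addresses or CIDR networks and that URL patterns are plain URL prefixes, whereas the patterns accepted by the Admin Console are richer. The function name and all sample values are hypothetical.

```python
import ipaddress


def feed_would_be_accepted(client_ip, trusted_networks, feed_urls, url_prefixes):
    """Approximate the two conditions: trusted source IP and one matching URL."""
    ip = ipaddress.ip_address(client_ip)

    # Condition 1: the feed client's address belongs to a trusted network.
    ip_trusted = any(ip in ipaddress.ip_network(net) for net in trusted_networks)

    # Condition 2: at least one URL in the feed matches a pattern
    # (simplified here to a prefix comparison).
    url_allowed = any(
        url.startswith(prefix) for url in feed_urls for prefix in url_prefixes
    )
    return ip_trusted and url_allowed


accepted = feed_would_be_accepted(
    "10.1.2.3",                                    # hypothetical feed client IP
    ["10.0.0.0/8"],                                # hypothetical trusted network
    ["http://intranet.example.com/docs/a.html"],   # URLs listed in the feed
    ["http://intranet.example.com/"],              # hypothetical URL pattern
)
print(accepted)
```

A check like this only helps catch configuration mistakes early. The search appliance remains the authority on whether a feed and its URLs are accepted.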

Push a content feed to the search appliance by performing the following steps:

1. Add the URL for the document defined in the feed client to the crawl patterns by using the Content
Sources > Web Crawl > Start and Block URLs page. URLs specified in the feed are crawled only
if they pass the patterns specified on this page.