Google Search Appliance Connectors: Deploying the Connector for File Systems (User Manual)

Overview of the GSA Connector for File Systems

The Connector for File Systems enables the Google Search Appliance to crawl and index
content from Windows shares. Each connector instance supports a single Windows share,
which can be specified as a UNC path or a mapped drive.
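
As a rough illustration of the two share forms, the following Python sketch classifies a path string as a UNC path or a drive-letter path. The function name `share_kind` and the example paths are hypothetical and are not part of the connector. The check is purely lexical, so it cannot tell a mapped network drive from a local drive.

```python
from pathlib import PureWindowsPath


def share_kind(path: str) -> str:
    """Classify a Windows path string by its form (lexical check only)."""
    # PureWindowsPath parses Windows paths on any OS without touching the disk.
    drive = PureWindowsPath(path).drive
    if drive.startswith("\\\\"):
        # UNC form, for example \\server\share\folder
        return "unc"
    if len(drive) == 2 and drive[1] == ":":
        # Drive-letter form, for example Z:\folder (mapped or local drive)
        return "drive-letter"
    return "unknown"


# Hypothetical example paths.
print(share_kind(r"\\fileserver\docs\reports"))
print(share_kind(r"Z:\reports"))
```

Either form identifies the same kind of resource; the difference lies only in how the share is named in the connector configuration.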

The following diagram provides an overview of how the search appliance gets content from
the repository through the Connector for File Systems. For explanations of the numbers in
the process, see the steps following the diagram.


1. The Connector for File Systems queries the repository for a single DocId.
2. The repository sends the DocId to the connector.
3. The connector constructs a URL from the DocId and pushes it to the search
   appliance in a metadata-and-URL feed. Note that this feed does not include the
   document contents.
4. The search appliance gets the URL to crawl from the feed.
5. The search appliance crawls the repository according to its own crawl schedule, as
   specified in the GSA Admin Console. It crawls the content by sending GET requests
   for content to the connector. If the content is in HTML format, the search appliance
   follows links within the page.
6. The search appliance requests DocIds that it discovers during the crawl from the
   connector.
7. The connector queries the repository for the requested DocIds.
8. The repository sends the DocIds to the connector.
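
Steps 3 through 7 can be sketched in Python as follows. This is a minimal illustration, not the connector's actual implementation. The base URL `http://connector.example.com:5678/doc/`, the datasource name, and all function names are assumptions made for the example. The XML element names follow the GSA feeds protocol as commonly documented and should be checked against the Feeds Protocol Developer's Guide.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote, unquote, urlsplit

# Hypothetical address at which the connector serves documents.
CONNECTOR_BASE = "http://connector.example.com:5678/doc/"


def docid_to_url(doc_id: str) -> str:
    """Step 3: construct a crawlable URL from a DocId."""
    return CONNECTOR_BASE + quote(doc_id, safe="/")


def url_to_docid(url: str) -> str:
    """Steps 6-7: recover the DocId from a URL the appliance requests."""
    prefix = urlsplit(CONNECTOR_BASE).path
    path = urlsplit(url).path
    if not path.startswith(prefix):
        raise ValueError(f"not a connector URL: {url}")
    return unquote(path[len(prefix):])


def build_feed(doc_ids, datasource="filesystem-connector") -> str:
    """Step 3: build a metadata-and-URL feed (URLs only, no contents)."""
    root = ET.Element("gsafeed")
    header = ET.SubElement(root, "header")
    ET.SubElement(header, "datasource").text = datasource
    ET.SubElement(header, "feedtype").text = "metadata-and-url"
    group = ET.SubElement(root, "group")
    for doc_id in doc_ids:
        # Each record carries only a URL; the appliance fetches the
        # content later with a GET request to that URL (step 5).
        ET.SubElement(group, "record",
                      url=docid_to_url(doc_id), mimetype="text/plain")
    return ET.tostring(root, encoding="unicode")


# Hypothetical DocId for illustration.
feed = build_feed(["reports/q1 summary.txt"])
print(feed)
print(url_to_docid(docid_to_url("reports/q1 summary.txt")))
```

Because the feed holds only URLs, the connector must be able to map each requested URL back to a DocId, which is why the sketch pairs `docid_to_url` with its inverse `url_to_docid`.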