Crawling and indexing content sources – Google Search Appliance Getting the Most from Your Google Search Appliance User Manual

Introduction

Providing Universal Search with a Google Search Appliance

Your goal is to deliver universal search to your users. The two major aspects of providing universal
search with a Google Search Appliance are:

“Crawling and Indexing Content Sources” on page 8

“Serving Search Results to Users” on page 9

This section provides an overview of each of these aspects.

Crawling and Indexing Content Sources

The Google Search Appliance can crawl and index content from many sources, including:

File shares—Files in 220 different formats, such as HTML, PDF, Microsoft Office, and many more

Intranets—All files on your intranets or other web servers

Content Management Systems—Information in content management systems, with built-in
connectivity to EMC Documentum, IBM FileNet, Open Text Livelink, and Microsoft SharePoint

Enterprise applications—Information in your business applications, using Google’s OneBox for
Enterprise, which enables a search appliance to connect with enterprise applications, such as
Customer Relationship Management (CRM) systems, Enterprise Resource Planning (ERP) systems, and
financial databases

Databases—Records in relational database management systems, including IBM DB2, Microsoft
SQL Server, MySQL, Oracle, and Sybase

World Wide Web—Information on the web

For more information about how the Google Search Appliance crawls and indexes different types of
content sources, refer to “Crawling and Indexing” on page 16.