Crawl and index for controlled-access content – Google Search Appliance Managing Search for Controlled-Access Content User Manual
After the search appliance authenticates a user by establishing the user’s identity, it performs
authorization checks to determine whether that user has access to the secure content that
matches their search. For detailed information about authorization on the Google Search Appliance, see
“Authorization” on page 38.
The Google Search Appliance also provides methods for enabling authentication and authorization
that do not require user impersonation. These are discussed in “The SAML Authentication Service
Provider Interface (SPI)” on page 31 and “The SAML Authorization Service Provider Interface” on
page 47.
This chapter provides information on how to configure the Google Search Appliance to crawl, index, and
serve controlled-access content. For examples of configuring a search appliance, see “Use Cases with
Public and Secure Serve for Multiple Authentication Mechanisms” on page 53.
Crawl and Index for Controlled-Access Content
The Google Search Appliance indexes all content that it is able to crawl. This includes both
controlled-access content and content that is available to anyone. Once you set up the search appliance
with access credentials, it maintains a copy of all crawled content in the index. The index allows the
search appliance to determine relevance and display secure results when a user performs a search.
Users see only the secure results that they are authorized to view.
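The idea that everything is indexed but each user sees only authorized secure results can be illustrated with a minimal Python sketch. This is not the appliance's implementation: the `Result` type, the `visible_results` function, and the in-memory `acl` mapping are all hypothetical names introduced here, and a real deployment makes the authorization decision through the mechanisms described under “Authorization” rather than a local dictionary.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class Result:
    """A hypothetical indexed document matching a search."""
    url: str
    secure: bool  # True for controlled-access content


def visible_results(
    results: List[Result], user: str, acl: Dict[str, Set[str]]
) -> List[Result]:
    """Keep public results, plus secure results the user is authorized to view.

    `acl` is an assumed mapping from URL to the set of authorized user names.
    A secure URL that is missing from `acl` is treated as not viewable.
    """
    return [
        r for r in results
        if not r.secure or user in acl.get(r.url, set())
    ]
```

In this sketch the filter runs at serve time, after the matching documents have been found in the index, which mirrors the separation between indexing all content and showing only authorized results.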
How a Search Appliance Indexes Controlled-Access Content
A search appliance discovers and indexes controlled-access content in the same way that it indexes all
other content: by performing a crawl through the content sources that are available to the web crawler,
file system crawler, relational database crawler, and the XML content feed interface.
When you define content sources, you must perform additional steps in the Admin Console to give the
search appliance access to controlled-access content:
1. Provide the search appliance with URL patterns that match the controlled content.
2. Give the search appliance access credentials to use with those patterns.
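The two steps above amount to pairing URL patterns with the credentials a crawler should present. The following Python sketch shows that pairing under simplifying assumptions: patterns are treated as plain URL prefixes (the appliance's actual URL pattern syntax is richer and is not reproduced here), the first matching entry wins, and the host names, user names, and the `credential_for` helper are hypothetical.

```python
from typing import List, NamedTuple, Optional, Tuple


class Credential(NamedTuple):
    """Hypothetical crawler login for a protected content source."""
    username: str
    password: str


# Assumed example table: (URL prefix, credential), most specific prefix first.
CRAWL_CREDENTIALS: List[Tuple[str, Credential]] = [
    ("http://intranet.example.com/hr/", Credential("crawler-hr", "hr-secret")),
    ("http://intranet.example.com/", Credential("crawler", "general-secret")),
]


def credential_for(
    url: str, table: List[Tuple[str, Credential]] = CRAWL_CREDENTIALS
) -> Optional[Credential]:
    """Return the credential of the first prefix that matches `url`.

    Returns None when no prefix matches, meaning the URL would be
    fetched without credentials in this simplified model.
    """
    for prefix, credential in table:
        if url.startswith(prefix):
            return credential
    return None
```

Ordering the table from most to least specific lets a narrow pattern override a broader one, which is one simple way to resolve overlapping patterns in a sketch like this.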