Google Search Appliance: Getting the Most from Your Google Search Appliance
Crawling and Indexing
If you prefer to have the search appliance crawl at scheduled times, you must also perform the
following tasks on the Crawl and Index > Crawl Schedule page in the Admin Console:
1. Selecting scheduled crawl mode.
2. Creating a crawl schedule.
3. Saving the crawl schedule.
To schedule crawling times for a specific host, change the host load and times on the Crawl and
Index > Host Load Schedule page. If you set a host load of 0, the crawler does not crawl that host
during the configured time period.
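The effect of a host load of 0 can be illustrated with a minimal Python sketch. This is a
conceptual model only, not the search appliance's implementation or API: the HostLoadWindow
class, the effective_host_load and may_crawl functions, the host name, and the default load
value are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class HostLoadWindow:
    """One hypothetical schedule entry: a host, a daily time window, and a load."""
    host: str
    start: time   # window start (inclusive)
    end: time     # window end (exclusive)
    load: float   # 0 means "do not crawl this host during the window"


def effective_host_load(windows, host, now, default_load=4.0):
    """Return the load of the first window matching host and time, else the default.

    default_load is an arbitrary placeholder, not a documented appliance setting.
    """
    for w in windows:
        if w.host == host and w.start <= now < w.end:
            return w.load
    return default_load


def may_crawl(windows, host, now):
    """A host is skipped whenever its effective load is 0."""
    return effective_host_load(windows, host, now) > 0


# Hypothetical schedule: do not crawl this host during business hours.
schedule = [HostLoadWindow("intranet.example.com", time(9, 0), time(17, 0), 0)]
```

In this sketch, may_crawl(schedule, "intranet.example.com", time(12, 0)) evaluates to False, while
the same call with time(18, 0) evaluates to True because no window applies and the default load is
used. Windows that cross midnight are not handled here.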
To add a document to the crawl queue right away, enter its URL in Re-Crawl These URL Patterns on
the Crawl and Index > Freshness Tuning page.
Learn More about Public Crawl
For in-depth information about public crawl, configuring a search appliance to crawl, and starting a
crawl, refer to the introduction in Administering Crawl.
For a complete list of file types that the search appliance can crawl, refer to Indexable File Formats.
Crawling and Serving Controlled-Access Content
Controlled-access content is secure content—it is restricted so that not all users have access to it. For
access to controlled-access content, users need authorization.
A search appliance discovers and indexes controlled-access content in the same way that it indexes all
other content: by performing a crawl through the content sources. However, the search appliance
requires access credentials to discover and index controlled-access content. Once you set up the search
appliance with access credentials, it maintains a copy of all crawled content in the index.
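To illustrate why a crawler needs credentials for such content, the following Python sketch builds
(without sending) an HTTP request that carries a user name and password. It assumes HTTP Basic
authentication purely for illustration; this section does not state which authentication mechanism
a given content source uses. The function name, URL, user name, and password are hypothetical.

```python
import base64
import urllib.request


def build_authenticated_request(url, username, password):
    """Build (but do not send) a GET request carrying HTTP Basic credentials."""
    # Basic authentication encodes "username:password" in Base64.
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", f"Basic {token}")
    return request


# Hypothetical protected document and crawler account.
request = build_authenticated_request(
    "http://secure.example.com/reports/q1.html", "crawler", "s3cret"
)
```

A server that protects the document this way would typically reject the same request without the
Authorization header, which is why a crawl without credentials cannot discover or index the content.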
The following figure provides an overview of crawling controlled-access content.