Google Search Appliance: Planning for Search Appliance Installation
How Many URLs Can Be Crawled?
The number of URLs that your search appliance can crawl depends on its model and license limit. The
following table lists, for each model, the maximum number of URLs matching the crawl patterns you
define that the search appliance can crawl.
How Do I Control Security?
Your business may require you to restrict access to certain enterprise content. You might want to
restrict what content is crawled and indexed, and you might want to restrict which users have access to
particular content. The Google Search Appliance supports various security models:
• You can exclude content from the index by storing the content in locations that are not crawled.
• You can exclude content from the index by using a robots.txt file to prevent particular locations
  from being crawled.
• You can require the search appliance to provide credentials before crawling particular locations.
• You can design an authentication model under which users who cannot be authenticated are not
  able to see particular content.
• You can design an authorization model that defines which users are authorized to perform certain
  functions on particular documents.
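The robots.txt option in the list above can be tried out before a crawl is configured. The following is a minimal sketch that uses Python's standard urllib.robotparser module to check which URLs a set of rules would exclude. The host name, the paths, and the "gsa-crawler" user-agent name are assumptions for illustration; confirm the user-agent name configured on your own appliance.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content. The paths and the "gsa-crawler"
# user-agent name are assumptions made for this example.
ROBOTS_TXT = """\
User-agent: gsa-crawler
Disallow: /hr/confidential/
Disallow: /drafts/
"""


def is_crawlable(url: str, user_agent: str = "gsa-crawler") -> bool:
    """Return True if the rules in ROBOTS_TXT allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "http://intranet.example.com/hr/handbook.html",
        "http://intranet.example.com/hr/confidential/salaries.html",
        "http://intranet.example.com/drafts/plan.doc",
    ):
        print(url, "crawlable" if is_crawlable(url) else "excluded")
```

A robots.txt rule only asks crawlers to stay away from a location. It does not restrict which users can reach the content, so it complements the authentication and authorization options rather than replacing them.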
The search appliance supports a range of authentication and authorization methods, including HTTP
Basic, Windows NT LAN Manager (NTLM) authentication, HTML forms-based authentication, certificate
authentication, Lightweight Directory Access Protocol (LDAP) directory servers, and the Authentication
and Authorization SPI.
For information on how to configure crawl for your security model, see Administering Crawl. For
information on how to integrate your search appliance with different authentication and authorization
models, see Managing Search for Controlled-Access Content.
How Can My Security Model Improve Performance?
Using policy ACLs and per-URL ACLs to control which users have access to content at particular URLs
speeds up the process of authorization and improves search appliance performance. For more
information on ACLs, see Managing Search for Controlled-Access Content.
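To illustrate the idea, the sketch below models an ACL check as a local lookup: a per-URL ACL is consulted first, then the most specific matching policy ACL. It is not the appliance's actual data model or API. The Acl class, the two tables, prefix matching for policy ACLs, and all URLs, users, and groups are hypothetical. The point is only that a decision made from stored ACLs needs no request to the content source at search time.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Acl:
    """Hypothetical ACL: the users and groups allowed to see a document."""
    users: frozenset = frozenset()
    groups: frozenset = frozenset()

    def permits(self, user: str, groups: Iterable[str]) -> bool:
        return user in self.users or not self.groups.isdisjoint(groups)


# Per-URL ACLs: one entry per exact URL (illustrative data).
PER_URL_ACLS = {
    "http://intranet.example.com/finance/q3.html": Acl(users=frozenset({"alice"})),
}

# Policy ACLs: modelled here as URL prefixes (an assumption of this sketch).
POLICY_ACLS = {
    "http://intranet.example.com/finance/": Acl(groups=frozenset({"finance"})),
    "http://intranet.example.com/": Acl(groups=frozenset({"employees"})),
}


def is_authorized(url: str, user: str, groups: Iterable[str] = ()) -> Optional[bool]:
    """Decide access from stored ACLs; return None when no ACL covers the URL."""
    acl = PER_URL_ACLS.get(url)
    if acl is None:
        prefixes = [p for p in POLICY_ACLS if url.startswith(p)]
        if not prefixes:
            return None  # undecided: another authorization mechanism must answer
        acl = POLICY_ACLS[max(prefixes, key=len)]  # most specific policy wins
    return acl.permits(user, groups)
```

In this sketch a None result stands for the case where no ACL applies and authorization has to be resolved some other way, which is the slower path that ACL coverage is meant to avoid.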
Search Appliance Model   Maximum License Limit   Maximum Number of URLs that Match Crawl Patterns
GB-7007                  10 million              ~13.6 million
GB-9009                  30 million              ~40 million
G100                     20 million              ~26 million
G500                     100 million             ~133 million
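As a planning aid, the limits in the table above can be encoded and compared with an estimate of how many URLs match your crawl patterns. This is a minimal sketch: the figures are copied from the table, the "~" values are approximate, and the function name and the idea of treating them as hard thresholds are assumptions of the example.

```python
from typing import List

# (maximum license limit, approximate maximum matching URLs), in millions,
# copied from the table above.
MODEL_LIMITS = {
    "GB-7007": (10, 13.6),
    "GB-9009": (30, 40),
    "G100": (20, 26),
    "G500": (100, 133),
}


def models_that_fit(matching_urls_millions: float) -> List[str]:
    """Return the models whose URL maximum covers the estimate, smallest first."""
    fits = [
        (max_urls, model)
        for model, (_license_limit, max_urls) in MODEL_LIMITS.items()
        if matching_urls_millions <= max_urls
    ]
    return [model for _max_urls, model in sorted(fits)]


if __name__ == "__main__":
    print(models_that_fit(25))   # models able to crawl about 25 million URLs
    print(models_that_fit(150))  # an estimate above every listed maximum
```

Because the table values are approximate, an estimate close to a model's maximum is best treated as not fitting.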