Google Search Appliance Deployment Governance and Operational Models User Manual
Consideration
Comments
Users and user types affected by the content integration
Start with the platforms that have the highest impact.
Ease of integration
In order of increasing difficulty of integration: direct crawl, connector
available, crawl through proxy, feed needs to be developed,
connector needs to be developed.
Security authentication and authorization mechanism required
Systems that integrate with security mechanisms that are directly
supported by the GSA provide the easiest integration.
Systems that require custom auth integration are more difficult to
integrate.
Search front end customization required to provide the most value to the search experience of the platform
Getting the most value out of content integration with the GSA
sometimes requires front end modifications to expose such things as:
● filters
● categories
● metadata in search results
● custom navigation processes
In order of increasing difficulty of integration: default XSLT, minor
XSLT modifications, major XSLT modifications, custom application
parsing and displaying the XML provided by the GSA.
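As a minimal sketch of the last option, a custom application can parse the XML results and render them itself. The example below assumes the result layout commonly returned by the GSA search protocol: a `GSP` root, `R` result elements with `U`, `T`, and `S` children for URL, title, and snippet, and `MT` elements that carry metadata in `N`/`V` attributes. The sample document, the `parse_results` helper, and the metadata names are hypothetical and should be checked against the output of your own appliance.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample of GSA XML output. Element names are assumed;
# verify them against what your appliance actually returns.
SAMPLE_XML = """<GSP VER="3.2">
  <RES SN="1" EN="2">
    <M>2</M>
    <R N="1">
      <U>http://intranet.example.com/policies/travel.html</U>
      <T>Travel policy</T>
      <S>How to book travel</S>
      <MT N="department" V="Finance"/>
    </R>
    <R N="2">
      <U>http://intranet.example.com/hr/leave.html</U>
      <T>Leave policy</T>
      <S>Annual leave rules</S>
    </R>
  </RES>
</GSP>"""


def parse_results(xml_text):
    """Return one dict per result with url, title, snippet, and metadata."""
    root = ET.fromstring(xml_text)
    results = []
    for r in root.iter("R"):
        results.append({
            "url": r.findtext("U", default=""),
            "title": r.findtext("T", default=""),
            "snippet": r.findtext("S", default=""),
            # Each MT element exposes one metadata name/value pair.
            "metadata": {mt.get("N"): mt.get("V") for mt in r.findall("MT")},
        })
    return results


if __name__ == "__main__":
    for hit in parse_results(SAMPLE_XML):
        print(hit["title"], hit["url"], hit["metadata"])
```

Once results are plain dictionaries, the application can expose metadata values as filters or categories in whatever template system it uses.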
Maturity of metadata available in the content source
Although metadata is not required for indexing, it can enrich the
content in the index and give you more options for shaping the
search experience.
Some content sources, by their nature, have metadata available
“out-of-the-box,” while other sources require adhering to a process
at publish time.
Also consider augmenting documents with metadata programmatically
at index time through a custom process.
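One way such a custom process could work is to push content through a feed whose records carry the extra metadata. The sketch below builds a feed document with Python's standard library, assuming the GSA feed XML layout (`gsafeed`, `header`, `group`, `record`, `metadata`, `meta`) and a `metadata-and-url` feed type. The `build_feed` helper, the data source name, and the URL are hypothetical, and any DOCTYPE declaration and upload step that your appliance expects are omitted.

```python
import xml.etree.ElementTree as ET


def build_feed(datasource, records):
    """Build a feed XML string; `records` maps a URL to a dict of metadata."""
    feed = ET.Element("gsafeed")
    header = ET.SubElement(feed, "header")
    ET.SubElement(header, "datasource").text = datasource
    # The feed type is an assumption; use whichever type your appliance expects.
    ET.SubElement(header, "feedtype").text = "metadata-and-url"
    group = ET.SubElement(feed, "group")
    for url, metadata in records.items():
        record = ET.SubElement(group, "record", url=url, mimetype="text/html")
        meta_parent = ET.SubElement(record, "metadata")
        # Sort names so the generated feed is stable between runs.
        for name, value in sorted(metadata.items()):
            ET.SubElement(meta_parent, "meta", name=name, content=value)
    return ET.tostring(feed, encoding="unicode")


if __name__ == "__main__":
    print(build_feed(
        "hypothetical_source",
        {"http://intranet.example.com/doc1.html": {"department": "Finance"}},
    ))
```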
Augmenting document metadata through Entity Recognition
The Entity Recognition feature of the GSA can enrich content with
entities that it extracts according to entity rules, which you define
using dictionaries or regular expressions.
Once defined, these rules tag documents with metadata at index time.
This feature can assign metadata to documents that might otherwise
not be tagged with metadata.
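The idea behind dictionary-based and regular-expression-based entity rules can be illustrated outside the appliance. The sketch below is plain Python rather than GSA configuration: the rule names, the dictionary terms, the pattern, and the `tag_entities` helper are all hypothetical, and the code only mimics the effect of attaching extracted entities to a document as metadata.

```python
import re

# Hypothetical rules: entity name -> set of dictionary terms, or a compiled pattern.
DICTIONARY_RULES = {"product": {"gsa", "connector"}}
REGEX_RULES = {"ticket_id": re.compile(r"\bTKT-\d{4}\b")}


def tag_entities(text):
    """Return metadata (entity name -> sorted list of values) found in `text`."""
    metadata = {}
    # Lowercased word set for case-insensitive dictionary lookups.
    words = set(re.findall(r"[a-z0-9]+", text.lower()))
    for entity, terms in DICTIONARY_RULES.items():
        found = sorted(terms & words)
        if found:
            metadata[entity] = found
    for entity, pattern in REGEX_RULES.items():
        found = sorted(set(pattern.findall(text)))
        if found:
            metadata[entity] = found
    return metadata


if __name__ == "__main__":
    print(tag_entities("See TKT-1234 about the GSA connector upgrade."))
```

A document that matches no rule yields an empty dictionary, which corresponds to a document that remains untagged.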