
Google Search Appliance Protocol Reference User Manual


Google Search Appliance: Search Protocol Reference

Request Format


If you want to filter languages other than the above, obtain the language code from ISO 639 (see
http://www.loc.gov/standards/iso639-2/php/code_list.php), index a document corpus containing the
desired languages, and run tests to determine that the search results are as expected.
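
As a rough illustration of the first step, the sketch below turns an ISO 639 code into a candidate filter name. It assumes that filter names for unlisted languages follow the same lang_ prefix pattern as the listed ones (lang_es, lang_sv, lang_tr); this reference does not confirm that, so any generated name still has to be tested against an indexed corpus. The function name language_filter_name is hypothetical.

```python
def language_filter_name(iso_code):
    """Build a candidate language filter name from an ISO 639 code.

    Assumption: unlisted languages use the same "lang_" + code pattern
    as the listed filters. Verify the result against a test corpus.
    """
    code = iso_code.strip().lower()
    # ISO 639-1 codes have two letters; ISO 639-2 codes have three.
    if not (code.isascii() and code.isalpha() and len(code) in (2, 3)):
        raise ValueError("not an ISO 639 two- or three-letter code: %r" % iso_code)
    return "lang_" + code


# Sanity check against a filter name that the table does list.
print(language_filter_name("ES"))
```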

Language Filtering for Traditional and Simplified Chinese

The search appliance determines the encoding of a search query and uses that encoding to return
search results. For example, if a user enters a search query using Traditional Chinese, the search results
are returned in Traditional Chinese. If a query is entered using Simplified Chinese, the results are also in
Simplified Chinese. The original encoding of the documents does not affect what is returned. If
documents encoded in Traditional Chinese are crawled and a Simplified Chinese query is entered, the
documents returned are encoded in Simplified Chinese.

However, if a search query uses characters that are common to both Simplified and Traditional Chinese,
the search appliance’s behavior is indeterminate. In some cases, the search appliance detects such
queries as Simplified Chinese, but in other cases, the language is detected as Traditional Chinese. One
example of a query that returns indeterminate results is the term Hong Kong. To resolve this issue, use
the lr parameter to specify whether you want to enforce Traditional Chinese (lang_zh-TW) or Simplified
Chinese (lang_zh-CN).
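
A minimal sketch of such a request follows. It builds a search URL that sets lr explicitly, so the result language does not depend on automatic detection. The host name gsa.example.com, the helper build_search_url, and the sample query string are illustrative assumptions; a real request usually carries further parameters that are omitted here.

```python
from urllib.parse import urlencode

# Filter values named in the text above.
TRADITIONAL_CHINESE = "lang_zh-TW"
SIMPLIFIED_CHINESE = "lang_zh-CN"


def build_search_url(base_url, query, language_filter):
    """Return a search URL whose lr parameter pins the result language."""
    # urlencode percent-encodes the query text as UTF-8.
    params = {"q": query, "lr": language_filter}
    return base_url + "?" + urlencode(params)


# Hypothetical appliance host; the query is written with characters
# shared by Traditional and Simplified Chinese.
url = build_search_url("http://gsa.example.com/search", "香港", TRADITIONAL_CHINESE)
print(url)
```

Passing SIMPLIFIED_CHINESE instead enforces Simplified Chinese results for the same query.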

Language    Automatic Language Filter Name
Spanish     lang_es
Swedish     lang_sv
Turkish     lang_tr