Content statistics – Google Search Appliance Administrative API Developers Guide: Java User Manual

Detailed document status entry properties:

// Fetch the detailed status entry for a single document URL from the diagnostics feed.
GsaEntry entry = myClient.getEntry("diagnostics",
    "http://server.com/secured/test1/doc_0_2.html");

// Print each detailed status property of the entry.
System.out.println("Collection List: " + entry.getGsaContent("collectionList"));
System.out.println("Forward Links: " + entry.getGsaContent("forwardLinks"));
System.out.println("Backward Links: " + entry.getGsaContent("backwardLinks"));
System.out.println("Is Cached: " + entry.getGsaContent("isCached"));
System.out.println("Document Date: " + entry.getGsaContent("date"));
System.out.println("Last Modified Date: " +
    entry.getGsaContent("lastModifiedDate"));
System.out.println("Latest Serving Version Timestamp: " +
    entry.getGsaContent("latestOnDisk"));
System.out.println("Currently In Process: " +
    entry.getGsaContent("currentlyInflight"));
System.out.println("Content Size: " + entry.getGsaContent("contentSize"));
System.out.println("Content Type: " + entry.getGsaContent("contentType"));
System.out.println("Crawl Frequency: " + entry.getGsaContent("crawlFrequency"));
System.out.println("Crawl History: " + entry.getGsaContent("crawlHistory"));
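
The repeated println calls above can also be driven from a table of property names. The following is only a minimal sketch, not part of the appliance client library: the class name StatusReport and the format method are hypothetical, and the lookup argument stands in for a call such as entry.getGsaContent, assuming that call returns a String. The property names and labels are the ones used in the listing above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper (not part of the GSA client library).
class StatusReport {
    // Property names from the listing above, mapped to their display labels.
    // LinkedHashMap keeps the insertion order when iterating.
    static final Map<String, String> LABELS = new LinkedHashMap<>();
    static {
        LABELS.put("collectionList", "Collection List");
        LABELS.put("forwardLinks", "Forward Links");
        LABELS.put("backwardLinks", "Backward Links");
        LABELS.put("isCached", "Is Cached");
        LABELS.put("date", "Document Date");
        LABELS.put("lastModifiedDate", "Last Modified Date");
        LABELS.put("latestOnDisk", "Latest Serving Version Timestamp");
        LABELS.put("currentlyInflight", "Currently In Process");
        LABELS.put("contentSize", "Content Size");
        LABELS.put("contentType", "Content Type");
        LABELS.put("crawlFrequency", "Crawl Frequency");
        LABELS.put("crawlHistory", "Crawl History");
    }

    // Builds one "Label: value" line per property.
    // 'lookup' is an assumption: any function from property name to value,
    // for example a lambda wrapping entry.getGsaContent(name).
    static String format(Function<String, String> lookup) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : LABELS.entrySet()) {
            sb.append(e.getValue())
              .append(": ")
              .append(lookup.apply(e.getKey()))
              .append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Stand-in lookup so the sketch runs without an appliance.
        System.out.print(format(name -> "<value of " + name + ">"));
    }
}
```

With a real entry, the call would presumably look like StatusReport.format(name -> entry.getGsaContent(name)), which keeps the list of properties in one place if more are added later.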

Content Statistics

Retrieve content statistics for each kind of document using the contentStatistics feed.

Properties of a detailed document status entry (diagnostics feed):

Property           Description
Entry Name         The URL of the document.
backwardLinks      The number of backward links to this document.
collectionList     A list of collections that contain this document.
contentSize        The size of the document content.
contentType        The type of the document.
crawlFrequency     How often the document is scheduled to be crawled. Possible
                   values are seldom, normal, and frequent.
crawlHistory       A multi-line history of the document crawl. Each line gives the
                   timestamp when the document was crawled, the document status
                   code, and the status description, in the following format:

                       timestamp status_code status_description
                       timestamp status_code status_description

                   For status code values, see “Document Status Values” on page 23.
currentlyInflight  Whether the document is currently in process.
date               The date of this document.
forwardLinks       The number of forward links for this document.
isCached           Indicates whether the cached page for this document is ready.
lastModifiedDate   The last modified date of this document.
latestOnDisk       The timestamp of the version being served.
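
The crawlHistory value described above is a multi-line string, so it usually needs to be split into records before it is useful. The following is a minimal sketch under stated assumptions: each line holds exactly the three fields timestamp, status_code, and status_description; the fields are separated by whitespace; and the timestamp and status code contain no whitespace themselves. The guide does not spell out the separator, so check it against real appliance output. The class and field names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of the GSA client library).
class CrawlHistoryParser {
    // One line of the crawlHistory value.
    static final class Record {
        final String timestamp;
        final String statusCode;
        final String statusDescription;

        Record(String timestamp, String statusCode, String statusDescription) {
            this.timestamp = timestamp;
            this.statusCode = statusCode;
            this.statusDescription = statusDescription;
        }
    }

    // Splits the multi-line value into records.
    // Assumption: fields are whitespace-separated; the description is
    // everything after the second field and may itself contain spaces.
    static List<Record> parse(String crawlHistory) {
        List<Record> records = new ArrayList<>();
        if (crawlHistory == null) {
            return records;
        }
        for (String line : crawlHistory.split("\\R")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // ignore blank lines
            }
            String[] parts = trimmed.split("\\s+", 3);
            if (parts.length < 3) {
                continue; // skip lines that do not match the assumed layout
            }
            records.add(new Record(parts[0], parts[1], parts[2]));
        }
        return records;
    }

    public static void main(String[] args) {
        // Made-up sample values, for illustration only.
        String sample = "1000\t1\tsample description one\n"
                      + "2000\t2\tsample description two\n";
        for (Record r : parse(sample)) {
            System.out.println(r.timestamp + " | " + r.statusCode
                + " | " + r.statusDescription);
        }
    }
}
```

In use, the input would be the string returned for the crawlHistory property, for example CrawlHistoryParser.parse(entry.getGsaContent("crawlHistory")), assuming that call returns the raw multi-line text. The status codes are kept as strings here so they can be looked up against the “Document Status Values” table without assuming a numeric range.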