Listing crawled documents – Google Search Appliance Administrative API Developers Guide: Protocol User Manual
Page 36

Listing Crawled Documents
To list documents, send an authenticated GET request to the root entry of the diagnostics feed. The request accepts the query parameters listed in the Parameter table below. For example:
http://Search_Appliance:8000/feeds/diagnostics?uriAt=http%3A%2F%2Fserver.com%2Fsecured%2Ftest1
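The example URL percent-encodes the uriAt value. A minimal Python sketch of building such a request follows. The helper name build_list_request and the auth_header argument are hypothetical, and this section does not say how the request is authenticated, so the sketch only passes through a header value that the caller supplies. Nothing is sent over the network.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_list_request(appliance, uri_at, auth_header=None):
    """Build (but do not send) a GET request for the diagnostics feed root."""
    # urlencode percent-encodes ':' and '/' so the URI prefix is safe in a query string.
    query = urlencode({"uriAt": uri_at})
    url = "http://%s:8000/feeds/diagnostics?%s" % (appliance, query)
    request = Request(url, method="GET")
    if auth_header is not None:
        # Assumption: the caller supplies a ready-made Authorization header value.
        request.add_header("Authorization", auth_header)
    return request


request = build_list_request("Search_Appliance", "http://server.com/secured/test1")
print(request.full_url)
```

Passing the request to urllib.request.urlopen would perform the call against a reachable appliance.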
The request returns a description entry, a set of document status entries, and a set of directory status entries. The properties of the description entry are listed in the Property table below.
Document Status Values (continued):
Value   Description
26      Unhandled content type
27      No filter for content type
34      Robots.txt forbidden
Parameter        Description
collectionName   Name of the collection that you want to list. The default value is
                 the last used collection.
flatList         false: List the files and directories that directly belong to the
                 indicated URI.
                 true: List all files starting with the indicated URI as a flat list.
                 The default value is false.
negativeState    false: Return only documents whose status is equal to view.
                 true: Return only documents whose status is not equal to view.
                 The default value is false.
pageNum          The page that you want to view. The files under a URI may be split
                 across several pages. Page numbers start from 1. The default value
                 is 1, the first page.
sort             The key field for sorting. host: sort by host name, file: sort by
                 file name, crawled: sort by number of crawled documents, errors:
                 sort by number of errors, excluded: sort by number of excluded
                 documents. The default value is "".
uriAt            The prefix of the URI of the documents that you want to list. If not
                 blank, it must contain at least http://hostname.domain.com/. The
                 default value is "".
view             A filter on the document status. The values of view are described in
                 the section “Document Status Values” on page 34. The default value
                 is all.
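As a sketch of how these parameters combine into one query string, the following Python helper applies the constraints stated in the table and leaves out any parameter that equals its documented default. The function name diagnostics_query and its keyword arguments are hypothetical. The parameter names, sort keys, and defaults are taken from the table. Omitting default-valued parameters is a design choice of this sketch, not a requirement of the API.

```python
from urllib.parse import urlencode, urlparse

SORT_KEYS = {"", "host", "file", "crawled", "errors", "excluded"}


def diagnostics_query(uri_at="", collection_name=None, flat_list=False,
                      negative_state=False, page_num=1, sort="", view="all"):
    """Return a query string for the diagnostics feed from the documented parameters."""
    if uri_at:
        parsed = urlparse(uri_at)
        # When not blank, uriAt must contain at least http://hostname.domain.com/.
        if not (parsed.scheme and parsed.netloc and parsed.path.startswith("/")):
            raise ValueError("uriAt must contain at least scheme, host, and a '/'")
    if page_num < 1:
        raise ValueError("pageNum starts from 1")
    if sort not in SORT_KEYS:
        raise ValueError("unknown sort key: %r" % sort)

    params = {}
    if uri_at:
        params["uriAt"] = uri_at
    if collection_name is not None:
        params["collectionName"] = collection_name
    # Values equal to the documented defaults are omitted to keep the URL short.
    if flat_list:
        params["flatList"] = "true"
    if negative_state:
        params["negativeState"] = "true"
    if page_num != 1:
        params["pageNum"] = str(page_num)
    if sort:
        params["sort"] = sort
    if view != "all":
        params["view"] = view
    return urlencode(params)


print(diagnostics_query("http://server.com/secured/", flat_list=True,
                        sort="errors", page_num=2))
```

The resulting string is appended after the "?" of the diagnostics feed URL.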
Property       Description
<Entry Name>   description
numPages       The total number of pages to return.
uriAt          The prefix of the URL taken from the query parameters.
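To illustrate how numPages can drive paging, here is a minimal Python sketch that reads the description entry properties from a response body and lists the page numbers still to be fetched. The XML layout in SAMPLE_RESPONSE (Atom entries carrying gsa:content elements with a name attribute, the entryID property name, and the gsa namespace URI) is an assumption made for illustration and is not given in this section. Only the property names numPages and uriAt come from the table.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
GSA = "{http://schemas.google.com/gsa/2007}"  # assumed namespace URI

# Hypothetical response body; the real feed layout may differ.
SAMPLE_RESPONSE = """
<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:gsa="http://schemas.google.com/gsa/2007">
  <entry>
    <gsa:content name="entryID">description</gsa:content>
    <gsa:content name="numPages">3</gsa:content>
    <gsa:content name="uriAt">http://server.com/secured/test1</gsa:content>
  </entry>
</feed>
"""


def description_properties(xml_text):
    """Return the name/value pairs of the entry whose entryID is 'description'."""
    root = ET.fromstring(xml_text.strip())
    for entry in root.iter(ATOM + "entry"):
        props = {c.get("name"): (c.text or "")
                 for c in entry.findall(GSA + "content")}
        if props.get("entryID") == "description":
            return props
    raise ValueError("no description entry found")


def remaining_pages(props, current_page=1):
    """Page numbers after current_page, given that pages are numbered from 1."""
    return list(range(current_page + 1, int(props["numPages"]) + 1))


props = description_properties(SAMPLE_RESPONSE)
print(props["uriAt"], remaining_pages(props))
```

Each remaining page number would be sent as the pageNum query parameter of a follow-up request.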