Crawl urls – Google Search Appliance Administrative API Developers Guide: .NET User Manual

Crawl URLs
Retrieve and update crawl URL patterns on a search appliance using the crawlURLs entry of the config
feed.
Retrieving Crawl URLs
Retrieve information about the URL patterns that the search appliance is crawling as follows:
// Send a request and print the response
GsaEntry myEntry = myService.GetEntry("config", "crawlURLs");
Console.WriteLine("Start URLs: " + myEntry.GetGsaContent("startURLs"));
Console.WriteLine("Follow URLs: " + myEntry.GetGsaContent("followURLs"));
Console.WriteLine("Do Not Crawl URLs: " + myEntry.GetGsaContent("doNotCrawlURLs"));
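Each property can hold several patterns separated by newlines, so it may be convenient to split a retrieved value into individual patterns before processing it. The following is a minimal sketch, not part of the GSA client library: CrawlPatternReader and its Split method are hypothetical names, and the sketch assumes that GetGsaContent returns the patterns as a single newline-delimited string.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of the GSA client library): splits a
// newline-delimited pattern string into a list of individual patterns.
public static class CrawlPatternReader
{
    public static List<string> Split(string content)
    {
        var patterns = new List<string>();
        if (string.IsNullOrEmpty(content))
        {
            return patterns;
        }

        // Split on '\n'; Trim() also removes a trailing '\r', so both
        // "\n" and "\r\n" line endings are handled.
        foreach (string line in content.Split('\n'))
        {
            string pattern = line.Trim();
            if (pattern.Length > 0)
            {
                patterns.Add(pattern);
            }
        }
        return patterns;
    }
}
```

Under these assumptions, CrawlPatternReader.Split(myEntry.GetGsaContent("startURLs")) would yield one list item per start URL, with blank lines skipped.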
Updating Crawl URLs
Update the crawl URL settings on a search appliance as follows. In this example, http://www.example.com/ is added as a start URL and a follow pattern, and URLs ending in .xls (spreadsheets) are excluded from the crawl.
// Create an entry to hold properties to update
GsaEntry updateEntry = new GsaEntry();
// Add the crawl URL properties to updateEntry
updateEntry.AddGsaContent("startURLs", "http://www.example.com/");
updateEntry.AddGsaContent("followURLs", "http://www.example.com/");
updateEntry.AddGsaContent("doNotCrawlURLs", ".xls$");
// Send the request
myService.UpdateEntry("config", "crawlURLs", updateEntry);
Property          Description
doNotCrawlURLs    Do not crawl URLs that match these patterns. Separate multiple URL patterns with newline delimiters.
followURLs        Follow and crawl only URLs that match these patterns. Separate multiple URL patterns with newline delimiters.
startURLs         Start crawling from these URLs. Separate multiple URLs with newline delimiters.
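Because each property takes multiple patterns as one newline-delimited value, a small helper can build that value from a collection. This is a sketch under stated assumptions: CrawlPatternWriter and Join are hypothetical names that do not belong to the GSA client library, and the choice of "\n" (rather than "\r\n") as the delimiter is an assumption, since the guide only says "new line".

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of the GSA client library): joins
// individual patterns into one newline-delimited string.
public static class CrawlPatternWriter
{
    public static string Join(IEnumerable<string> patterns)
    {
        if (patterns == null)
        {
            throw new ArgumentNullException("patterns");
        }

        // Drop null or blank entries and trim surrounding whitespace
        // so the result contains exactly one pattern per line.
        var cleaned = new List<string>();
        foreach (string pattern in patterns)
        {
            if (pattern == null)
            {
                continue;
            }
            string trimmed = pattern.Trim();
            if (trimmed.Length > 0)
            {
                cleaned.Add(trimmed);
            }
        }

        // Assumption: "\n" is the delimiter the appliance expects.
        return string.Join("\n", cleaned.ToArray());
    }
}
```

A call such as updateEntry.AddGsaContent("doNotCrawlURLs", CrawlPatternWriter.Join(new[] { ".xls$", ".doc$" })) would then pass two patterns in a single property value; the ".doc$" pattern here is purely illustrative.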