Created by
Valeria Gallo Stampino
November 6th, 2006



Google

 

Google has one of the largest indexes of any search engine, though the exact size is unknown and the company does not currently issue official estimates (Battle par. 3). Google indexes web pages as well as files in PDF, .ps, .doc, .xls, .txt, .ppt, .rtf, .asp, .wpd, and other formats. In addition, Google's spider crawls non-indexed pages. These unindexed URLs can be identified in Google's results because they lack an extract, a page size, and a cached copy of the page (Notess, Google Special Report par. 2).

 

To determine the order of search results, Google uses a system called PageRank™, which ranks web pages. Although the algorithm is proprietary and a trade secret, publicly available information suggests that PageRank is based on analyzing a page's link structure as an indicator of that page's value (Google Technology).

 

One factor that presumably influences a page's ranking is the existing ranking of the pages that link to it: pages with a higher rank increase the rank of the pages they link to. Google also looks at the frequency with which a search term appears on a page and at the order of the search terms, and it ranks phrase matches higher than matches on individual keywords.
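The link-based idea described above can be illustrated with a short sketch. Google's actual algorithm is proprietary, so this is only the simplified, publicly described form of the calculation, not Google's implementation. The damping value of 0.85, the function name `pagerank`, and the four-page `toy_web` are all illustrative assumptions.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank sketch (an assumption, not Google's real system).

    links maps each page to the list of pages it links to.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start with equal rank everywhere

    for _ in range(iterations):
        # Every page gets a small baseline share, independent of links.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on, split evenly among its links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical toy web: A, B, and D all link to C; C links back to A.
toy_web = {"A": ["C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(toy_web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy web, C ranks highest because three pages link to it, and A ranks well above B and D because its single incoming link comes from the highly ranked C. This is the effect described above: a link from a higher-ranked page is worth more.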

 

Finally, according to the company, “Google goes far beyond the number of times a term appears on a page and examines all aspects of the page's content […] to determine if it's a good match for your query.” It is unclear, however, what exactly “all aspects of the page's content” covers and to what extent these aspects are examined.

 

Search logic and syntax

  • Default Boolean AND

  • Boolean OR is supported, but it must be capitalized

  • Supports some nesting of Boolean operators

  • Phrase searching available using quotation marks

  • Search terms are not case sensitive

  • Automatic stemming; the search results page does not indicate when stemming has been applied; stemming can be disabled by enclosing a word in quotation marks

  • Spell checking

  • Stop words: words omitted by Google are listed at the top of the search results page. These words can be included by adding a "+" sign in front of each of them.

  • Minus symbol “-” can be used to exclude a word or phrase

 

Display of results

 

Google shows the document title, the first few lines of text, the page URL, the size of the page in kilobytes, a link to the cached version, and a “similar pages” link. The “similar pages” feature does not always display thematically related results: because similar pages are retrieved based on back links, the feature also displays pages that are not thematically similar but that link to the same pages (SearchEngineWatch forums).

 

Figure 1 Display of search results in Google

 

 

Limits: language, country, domain, date, and some field searching. Google also offers other search features, though they do not always work as advertised. For example, Google’s “link:” function is supposed to retrieve all pages linking to a particular site, but it retrieves only a portion of them.


Search strategy for Google

 

"electronic collection development policy" OR "collection development policy for electronic resources"
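The search string above relies on two of the syntax rules listed earlier: quotation marks for phrase searching and a capitalized OR. As a rough sketch, a string of this kind could be assembled and URL-encoded as follows. The helper names `phrase` and `build_query` are hypothetical, and the `q` parameter name is an assumption about how the query would be passed in a search URL.

```python
from urllib.parse import urlencode


def phrase(text):
    """Wrap text in quotation marks so it is searched as an exact phrase."""
    return '"' + text + '"'


def build_query(any_of=(), all_of=(), exclude=()):
    """Assemble a query string from the syntax rules described earlier.

    any_of  -- alternatives joined with a capitalized OR
    all_of  -- terms that must all appear (AND is the default, so no operator)
    exclude -- terms or phrases prefixed with the minus symbol
    """
    parts = []
    if any_of:
        parts.append(" OR ".join(any_of))
    parts.extend(all_of)
    parts.extend("-" + term for term in exclude)
    return " ".join(parts)


# Rebuild the search strategy used in this evaluation.
query = build_query(any_of=[
    phrase("electronic collection development policy"),
    phrase("collection development policy for electronic resources"),
])
print(query)

# URL-encode the query; the "q" parameter name is an assumption.
encoded = urlencode({"q": query})
print(encoded)
```

The sketch does not add parentheses, so it makes no claim about how Google groups OR alternatives when they are mixed with other terms.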

 

Top Five Results in Google*

 

Electronic Collection Development Policy


Electronic Collection Development Policy. I. INTRODUCTION. 1. Purpose The E-Library Collection Development Plan states the principles and guidelines that ...
www.ncu.edu/elrc/policy/collection_dev.asp - 35k - Cached - Similar pages

Relevance 4

WelchWeb: Collection Development Policy for Electronic Resources


Collection Development Policy for Electronic Resources. INTRODUCTION; VISION FOR THE DIGITAL LIBRARY; SCOPE; SELECTION CRITERIA; MULTIPLE FORMATS AND COPIES ...
www.welch.jhu.edu/about/ecdpolicy.html - 25k - Cached - Similar pages

Relevance 4

Electronic Collections Development


Collection Development Policy for Electronic Resources (September 24, 1998); This well-constructed policy begins by stating clearly that “[w]hen possible, ...
www.library.yale.edu/~okerson/ecd.html - 50k - Cached - Similar pages

Relevance 5

Collection Development Policy for Electronic Resources


Collection Development Policy for Electronic Resources. Introduction Purpose of this Policy Scope Selection Criteria Multiple Formats and Multiple Copies ...
new.ahsl.arizona.edu/policies/cdpolicy.cfm - 15k - Cached - Similar pages

Relevance 4

[Publib] Electronic Collection Development Policy


[Publib] Electronic Collection Development Policy ... Our library is working to establish an Electronic Collection Development Policy. ...
lists.webjunction.org/wjlists/publib/2006-March/096885.html - 4k - Cached - Similar pages

Relevance 2

 

 

All five results in Google refer to the topic of digital collection development in libraries. Results 1, 2, and 4 are examples of policy statements from American academic libraries, so their relevance is high. The third result was assigned the highest relevance because the page includes links to sample policies and other useful resources. Finally, result 5 was the least relevant: it is simply a message sent to a listserv in which somebody asks for information on the topic rather than offering it.

 

*Note: These search results were recorded on October 28th, 2006.

 

