Created by
Valeria Gallo Stampino
November 6th, 2006



Gigablast

 

According to the company, its databases reach 200 billion full pages and 100,000 servers (Features). However, others hold a different view of Gigablast’s index. Although the company claims that “URLs are indexed in real-time and link analysis is done on the fly”, Greg Notess states that Gigablast has a “small database and [is] not refreshed as frequently as others” (Gigablast par. 4). This statement conflicts with the company’s claim of having one of the “freshest indexes in the world”.

 

Like Google and Ask.com, this search engine also indexes PDF, MS Word, PowerPoint, Excel and PostScript documents. Gigablast indexes all "generic" meta tags, beyond just the meta description and meta keywords. It also displays meta tags in the list of search results (Notess par. 15). Few competing search engines currently do this, because spammers can easily manipulate meta tags.

 

Gigablast also offers a custom topic search. What the company actually offers, however, is the ability to pre-select up to 500 web sites, plus a search box that searches only those sites. This is not exactly a ‘custom topic search’, since the ‘aboutness’ of the search is not defined; instead, the search is restricted to sources that are more likely to contain information on a given topic, which may or may not improve relevancy.
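The difference between restricting sources and defining a topic can be illustrated with a minimal sketch. It is not Gigablast's implementation: the function name `restrict_to_sites` is hypothetical, and the sketch assumes result URLs that include a scheme such as `http://`. It only filters results by host and does nothing to judge what a page is about.

```python
from urllib.parse import urlparse

MAX_SITES = 500  # upper limit on pre-selected sites, as stated in the text


def restrict_to_sites(result_urls, allowed_sites):
    """Keep only the results hosted on one of the pre-selected sites.

    Hypothetical helper: the filter looks at the host name alone, so
    the topic ('aboutness') of each page plays no role.
    """
    if len(allowed_sites) > MAX_SITES:
        raise ValueError("at most 500 sites may be pre-selected")
    allowed = {site.lower() for site in allowed_sites}
    kept = []
    for url in result_urls:
        host = urlparse(url).netloc.lower()  # empty if the scheme is missing
        if host in allowed:
            kept.append(url)
    return kept


results = [
    "http://library.hartford.edu/llr/elresman.htm",
    "http://www.texshare.edu/programs/academicdb/collectionpolicy.html",
]
filtered = restrict_to_sites(results, ["library.hartford.edu"])
```

Here `filtered` keeps the first URL only because of where it is hosted, not because of its subject.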

 

The company claims to “custom weight the query terms exactly how you want”. It is unclear how this can be done.

 

Search logic and syntax

  • Default Boolean AND

  • Boolean OR is supported but it must be capitalized

  • Nesting of Boolean operators available using parentheses, but limited

  • No truncation available

  • Phrase searching available using quotation marks

  • Search terms are not case sensitive

  • Spell checking

  • Stop words: common words such as 'the', 'of', 'and', 'it', 'is', and 'or' are ignored. A plus sign "+" preceding the word disables this feature.

  • Minus symbol “-” can be used to exclude a word or phrase

  • Query refinement with the “|” operator. Example: john | smith. “Gigablast performs a search for john and then another search is done on just those results for smith. So all results have both john and smith but they are scored solely by smith.”

  • Cached pages and links to the Wayback Machine (Notess par. 3)
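The query refinement behaviour quoted in the list above (john | smith) can be sketched as a two-stage filter. This is only an illustration under stated assumptions: the names `term_count` and `refine` are hypothetical, documents are plain strings, and scoring by simple term frequency is an assumption, since the source does not say how Gigablast actually scores results.

```python
def term_count(text, term):
    """Count case-insensitive occurrences of a whole word in text."""
    return text.lower().split().count(term.lower())


def refine(docs, first, second):
    """Search for `first`, then search those results for `second`.

    Every returned document contains both terms, but the ordering
    depends on `second` alone (term frequency is an assumed score).
    """
    stage_one = {name: text for name, text in docs.items()
                 if term_count(text, first) > 0}
    scored = [(term_count(text, second), name)
              for name, text in stage_one.items()
              if term_count(text, second) > 0]
    # Highest score first; names break ties to keep the order stable.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored]


docs = {
    "a": "john john john smith",
    "b": "john smith smith",
    "c": "smith smith smith",
    "d": "john only",
}
ranked = refine(docs, "john", "smith")
```

In this toy data, document "b" outranks "a" even though "a" mentions john more often, because only smith contributes to the score; "c" and "d" are dropped for lacking one of the two terms.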

 

Limits: domain, title, link, site, ip, file type.

 

Display of results: document title, keyword-in-context extract, URL, file size, link to cached pages, date crawled, last modified date, meta tags (optional display). Gigablast also offers links to Related Topics and Related Pages.

Figure 3. Display of search results in Gigablast


This search engine gives a better ranking to those sites that feature a link to Gigablast.com. Also, if a web page ties in rank with another web page, the page linking to Gigablast.com is given the higher rank (Gigaboost). Although it is understandable that a newer and smaller company such as Gigablast needs strategies to compete with the bigger players, this strategy results in biased and less relevant search results.
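The tie-breaking part of this behaviour can be sketched as follows. This is an illustration only: the `rank` function, the tuple layout, and the numeric scores are hypothetical, and the sketch models only the tie-break, because the source does not say how large the general ranking boost is.

```python
def rank(pages):
    """Order pages by score, breaking ties in favour of Gigablast linkers.

    pages: list of (url, score, links_to_gigablast) tuples (assumed layout).
    """
    # Sort by descending score; within a tie, False sorts before True,
    # so pages that do link to Gigablast.com come first.
    return sorted(pages, key=lambda page: (-page[1], not page[2]))


pages = [
    ("a.example", 0.8, False),
    ("b.example", 0.8, True),
    ("c.example", 0.9, False),
]
ordered = [url for url, _, _ in rank(pages)]
```

Pages "a.example" and "b.example" have equal scores, so the one linking to Gigablast.com is placed higher regardless of which is more relevant to the query.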

 

Search strategy for Gigablast

 

Because Gigablast searches a smaller portion of the web than its competitors, searching for a long phrase was not as successful a strategy. The engine retrieved the phrase from non-relevant contexts, such as the bibliography of a paper in Spanish. For this reason, a slight modification was made. Instead of using:

 

"electronic collection development policy" OR "collection development policy for electronic resources"

 

 

The following combination was used:

 

electronic "collection development policy"

 

By leaving ‘electronic’ out of the phrase, the engine has more flexibility to AND the two concepts in any order.
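The effect of moving ‘electronic’ outside the quotation marks can be shown with a minimal matcher. This is a sketch, not Gigablast's matching logic: the helpers `tokens`, `has_phrase`, and `matches` are hypothetical, and the sample text is the opening of the snippet of the second result listed below.

```python
import re


def tokens(text):
    """Lower-case the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def has_phrase(text, phrase):
    """True if the words of `phrase` occur consecutively in `text`."""
    words, target = tokens(text), tokens(phrase)
    n = len(target)
    return any(words[i:i + n] == target
               for i in range(len(words) - n + 1))


def matches(text, words=(), phrases=()):
    """Default Boolean AND: every word and every phrase must be present."""
    present = set(tokens(text))
    return (all(w.lower() in present for w in words)
            and all(has_phrase(text, p) for p in phrases))


snippet = "Electronic Resources. Collection Development Policy. Statement of"

# Original query: two long exact phrases joined by OR.
original_hit = (
    has_phrase(snippet, "electronic collection development policy")
    or has_phrase(snippet, "collection development policy for electronic resources")
)

# Revised query: one loose word ANDed with a shorter phrase.
revised_hit = matches(snippet, words=["electronic"],
                      phrases=["collection development policy"])
```

With this snippet, `original_hit` is false because ‘Resources’ sits between ‘Electronic’ and the rest of the phrase, while `revised_hit` is true because ‘electronic’ may appear anywhere in the text.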

 

Top Five Results in Gigablast*

 

ALA | Electronic Resources Bibliography
... have a written collection development policy for electronic materials ...
collection development policy statement for electronic information ... an
electronic information collection development policy statement". ...
www.ala.org/ala/rusa/rusaourassoc/rusasections/codes/codessection/
codescomm/colldevpolicies/electronicresources/Electronic_Resources_Bibliography.htm - 40k - modified: Apr 07 2006

Relevance 5

Electronic Resources Collection Development Policy
Electronic Resources. Collection Development Policy. Statement of ... Electronic
resources are cataloged as part of the library collection ... Guidelines for
purchase of electronic materials set by the Western ...
bullpup.lib.unca.edu/genproced/colldev_eres_pol.html - 7k - modified: Oct 07 2004

Relevance 5


University of Alabama Libraries, Collection Policy for Electronic Resources
... guidelines as described in the collection development policy as well ...
environment of electronic resources, each electronic resource will be ...
information environment.. *Electronic resources should fall within ...

Relevance 5

elresman.htm-ERC Development Policy
Electronic Resources. Collection Development Policy. I. Selection ... Collection
Development Policy. 1. Selection. 2. Requesting New ... Electronic materials
billed to the Bursar's Office will be subject to ...
library.hartford.edu/llr/elresman.htm - 15k - modified: Apr 25 2006

Relevance 5

Electronic Resource Collection Development Policy - Texas State Library and ...
Electronic Resource Collection Development Policy   TEXSHARE ... The TexShare
Electronic resources are selected through recommendations ... TexShare
electronic resources are supplemented through the selection ...
www.texshare.edu/programs/academicdb/collectionpolicy.html - 29k - modified: Mar 16 2006

Relevance 4

 

The first result, the ALA document, is relevant because it includes an annotated bibliography on electronic collection development issues and it comes from a reputable source. Results 2, 3, and 4 are all relevant because they are samples of policy statements from North American academic libraries.

 

Finally, the last result received the lowest relevancy ranking because, although it is on the desired topic, it does not apply exclusively to academic libraries but also to other types of libraries in the state of Texas. It must be acknowledged that the requirement of limiting the search to academic libraries was not specified in the search terms used, so the retrieval of documents related to other types of libraries should not be a surprise.

 

*Note: These search results were recorded on October 28th, 2006.

 

