The Google Quality Raters Handbook
Google Quality Raters use this document to help them classify queries, measure relevancy, and rate search results. To do so, a Quality Rater must understand how Google works, and the document explains a good deal of that. Let me pull out some of those details in easy-to-read bullet points.
Three Query Types:
- Navigational: someone searching for a site, such as a search for IBM.
- Informational: someone searching for information on a topic of interest, such as finding out more information on Danny Sullivan.
- Transactional: someone looking to purchase something, either online or offline, such as a search for ‘buy ipod touch.’
Quality Rating Scales:
- Vital: This is the highest score a web page can receive for a query. A vital result usually comes from a navigational query where the returned page is the official web page of the thing being searched for. For a search on ‘ibm’, the vital result would be www.ibm.com.
- Useful: This is the second highest score a web page can receive for a given query. A useful rating should be assigned to results that “answer the query just right; they are neither too broad nor too specific.” One of the examples given for a useful rating would be a search on meningitis symptoms with a resulting web page of http://www.webmd.com/hw/infection/aa34586.asp
- Relevant: This comes after a useful rating and is used for results that are still helpful but less so. The guidelines say such a result is often “less comprehensive, come from a less authoritative source, or cover only one important aspect of the query.” An example would be a review of laptop computers that covers only five computers rather than all computers in its class. Since it is not a fully comprehensive review, it would be rated as relevant and not useful.
- Not Relevant: This rating is used for pages that are not helpful for the query but are still somewhat connected to it. A not relevant page might be “outdated, too narrowly regional, too specific, too broad” and so on. One of the examples given is a search for ‘BBC’ that returns a specific article from the BBC; the page is too specific and so is not relevant to the query.
- Off-Topic: This is the lowest rating a page can receive for a query. If the returned page is completely unrelated to the query, it would be given a rating of “off topic.” An example given is a query on ‘hot dogs’ that returns a page about doghouses.
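The scale above is ordered, so it can be sketched as an ordered enumeration. This is only an illustration: the numeric values are an assumption (the guidelines as summarized here give an ordering, not numbers), and the `best_result` helper and the second and third URLs in the example are hypothetical.

```python
from enum import IntEnum


class Rating(IntEnum):
    """Five-point scale, lowest to highest. Numeric values are assumed."""
    OFF_TOPIC = 1
    NOT_RELEVANT = 2
    RELEVANT = 3
    USEFUL = 4
    VITAL = 5


def best_result(ratings):
    """Return the URL with the highest rating from a {url: Rating} mapping."""
    return max(ratings, key=ratings.get)


# Hypothetical ratings for the query 'ibm'; only www.ibm.com comes from the text.
ibm_ratings = {
    "www.ibm.com": Rating.VITAL,
    "example.com/ibm-company-history": Rating.RELEVANT,
    "example.org/doghouses": Rating.OFF_TOPIC,
}

top = best_result(ibm_ratings)
```

Using `IntEnum` means ratings compare directly, so `Rating.USEFUL > Rating.RELEVANT` holds without any extra code.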
Categories For Results That Can’t Be Rated:
Not every result can be rated, and those that can’t must still be classified somehow. The categories for those types of results include:
- Didn’t Load: For pages that return a 404 error, page not found, product not found, server time out, 403 forbidden, login required, and so on.
- Foreign Language: This is given to a page whose language is a “foreign language” relative to the “target language” of the query. English is never a foreign language, no matter what. So, if you search in Chinese for something and a Hebrew page is returned, it is a foreign language, but if an English page is returned, it is not a foreign language. There are exceptions to the rule.
- Unratable: When the rater cannot rate it for any other reason.
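The first two categories above can be sketched as a small decision function. This is a rough sketch under stated assumptions, not how raters or Google actually make the call: the function names, the use of two-letter language codes, and treating a missing status as a server time out are all hypothetical, and the unspecified exceptions to the English rule are not modeled.

```python
from typing import Optional

# Status codes the article names explicitly (403 forbidden, 404 not found).
DIDNT_LOAD_STATUSES = {403, 404}


def is_foreign_language(page_lang: str, target_lang: str) -> bool:
    """English is never foreign; otherwise foreign means not the target language."""
    if page_lang == "en":
        return False
    return page_lang != target_lang


def cannot_rate_category(http_status: Optional[int],
                         page_lang: str,
                         target_lang: str) -> Optional[str]:
    """Return a can't-rate category, or None if the page can be rated normally.

    Assumption: http_status is None when the server timed out.
    """
    if http_status is None or http_status in DIDNT_LOAD_STATUSES:
        return "Didn't Load"
    if is_foreign_language(page_lang, target_lang):
        return "Foreign Language"
    return None
```

For the example in the text, a Chinese query (`"zh"`) that returns a Hebrew page (`"he"`) is classified as a foreign language, while an English page for the same query is not.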
Now for the really good stuff: spam labels. This is a new addition to the quality rater guidelines and is fairly small. The labels include:
- Not Spam: The not spam rating is given to a page that “has not been designed using deceitful web design techniques.”
- Maybe Spam: This label is given when you feel the page is “spammy,” but you are not 100% convinced of that.
- Spam: Given to pages you feel are violating Google’s webmaster guidelines.
Flags are for pages that require immediate attention, such as:
- Pornographic content
- Malicious code on pages
That is a brief overview of some of the many points in the document. For more, see the archived document and for some history, check out Google Blogoscoped. Here is an additional copy of this document at Huomah.com.