An Interview With A Google Search Quality Rater

Since at least 2005, Google has been using a large, worldwide focus group to help review its search results and the quality of the web pages that rank well in its algorithm. The people in this program are called Quality Raters and, as you can imagine, the work they do is important to search marketers everywhere.

Google was advertising Quality Rater jobs in late 2004, but today the Quality Raters don’t work for Google directly; they work for contractors such as Lionbridge, Leapforce, Butler Hill and possibly others. According to Lionbridge’s Internet Assessors Program job page, it has more than 4,500 people around the world rating search results. Leapforce’s website doesn’t indicate how many are in its program, but the job listings page includes opportunities with names like “Search Engine Evaluator,” “Social Search Engine Evaluator” and “Search Quality Judge.”

The Quality Raters’ work has become more widely known over the years thanks to a couple of occasions when the guideline document that Google provides to raters was leaked online. (See our posts in March 2008 and October 2011.) Webmasters have also noticed unique quality rater referral strings, indicating when one of the evaluators had visited a website.

After Jennifer Ledbetter posted about the program last fall, one current Quality Rater contacted Search Engine Land wanting to explain and clarify some of what’s been written and said about the program. Since then, with a couple of breaks for holidays, I’ve traded numerous emails with this person … who, in addition to working for Lionbridge as a Quality Rater, also happens to work for a US-based search marketing agency.

To help ensure that this person, whom I’ve never met, is actually a Quality Rater, I asked for some screenshots from inside the website where the rating work is done. A couple of those are inserted within the interview, and here’s an image of the rating tasks home page showing an empty task queue.

[Image: rating tasks home page showing an empty task queue]

Below, we talk about the hiring process, what Quality Raters look for when they examine websites, details of the different evaluation tasks they do and much more.

Q&A With A Google Search Quality Rater

SEL: Tell me how, when and why you got started with the Quality Rater program.

Quality Rater: I first started with Lionbridge in May of 2011. I was looking for work because my then-current employer had told me I was taking a pay cut, so I needed a way to add income. I began searching all the normal places for job listings and came across one on Craigslist for a Quality Rater. It sounded cool, so I sent them my resume and they got back to me the next day saying they were excited to have me and, if I could just pass a few simple tests, I would be hired. That was the easy part.

Did the job listing specifically mention Google?

The listing didn’t mention anything about Google but as soon as they contacted me, they said I would be doing work related to Google.

So, you knew it was Google-related. At what point did you know that you’d be rating Google’s search results?

I knew before I got hired.

One thing I think the SEO community is missing is that this program has nothing to do with SEO or rankings. What this program does is help Google refine their algorithm. For example, the Side-by-Side tasks show the results as they are next to the results with the new algorithm change in them. Google doesn’t hire these raters to rate the web; they hire them to rate how they are doing in matching users’ queries with the best source of information.

Let’s talk about the hiring process. There’s some kind of test. Was it difficult?

I had six days to complete both parts of the test, with the second part opening after I passed the first test.

The first test turned out to be a 24-question, essay-response theoretical test that asked questions based on a PDF they had sent me. The questions were designed to test my ability to take the rules and apply them to situations that weren’t covered in the PDF. One that I vaguely remember was about spam and what to do if the site didn’t show any signs of spam, but it gave off a spammy feeling. It was the hardest test I have ever taken (for a reference point, I’m a Literature major who has taken graduate-level courses).

Only after having passed that test did I get to take the practical exam, which had more than 140 questions. This test had actual results that I had to rate. In order to be hired, I needed to score a 90% or higher in each of the four categories (which were Vital, Useful, Relevant and Off-Topic or Useless). Ideally, these represented the actual tasks that I would receive as a rater.
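[Ed. note: As a rough illustration of the pass rule described above (90% or higher in each of the four categories), here is a minimal sketch. The function name, the data layout and the sample scores are hypothetical; the interview does not describe how Lionbridge actually computes the score.]

```python
# Hypothetical sketch: the data layout and sample numbers are assumptions,
# not taken from the actual exam.
PASS_PERCENT = 90  # "90% or higher in each of the four categories"


def passes_exam(results):
    """results maps category -> (correct, total).

    The candidate passes only if every category meets the threshold,
    so a strong overall average cannot hide one weak category.
    """
    return all(
        total > 0 and correct * 100 >= PASS_PERCENT * total
        for correct, total in results.values()
    )


# Made-up scores: the overall average is above 90%, but one category is not.
sample = {
    "Vital": (35, 36),
    "Useful": (33, 36),
    "Relevant": (34, 36),
    "Off-Topic or Useless": (31, 36),
}
print(passes_exam(sample))
```

In this made-up sample the overall score is 133 out of 144, yet the last category falls short of 90%, so the check fails. That is the practical consequence of requiring the threshold per category rather than overall.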

What were the questions like?

To give you an example of the questions asked:

Query [crispy cream], English (US)
URL: http://www.treblebooster.com/

It would then be up to me to visit the page (something that I want to stress, because blogs out there have been saying that a rater can rate the page without visiting it), decide if it fits the query and then assign a rating. It really is up to the rater, but the correct answer here is Useful because of the spelling. If the user had typed “Krispy Kreme,” then this result would be off-topic, but because it is “crispy cream,” and the guitars on this page are called Crispy Cream, this could be the page the user is wanting.

There were 143 just like that. It was good times.

Do you have any direct contact with anyone at Google, or do you only communicate with Lionbridge?

I have no contact with Google; it’s only Lionbridge.

After you get hired, is there some kind of training?

After I got hired there was a weekly, two-hour webinar along with training modules to complete. It was very intense training. During the first four weeks, I was required to comment on every rating I gave. These comments were then reviewed and commented on, giving me feedback on my ratings.

At what point do you get the raters’ handbook?

I got this the moment I got hired. It is basically just a list of tasks we perform along with examples of how to rate them.

How does Lionbridge (or Google) describe the handbook?

They refer to it as the guidelines, not a handbook.

While we are on the subject of guidelines, one thing that really impressed me was how they have more than one rater looking at a site. I believe (I’m not sure, I’m going off the comments left by other raters) that there are about six raters looking at each task. If I rate something as useful but another rater says it’s off-topic, we must come to an agreement (through comments and debate) before the rating is submitted.

How much do you make and how often do you get paid?

I get paid $14.50/hour and I am paid once a month. I’m only able to work a max of 20 hours a week and a total max of 80 hours a month.

[Image: quality rater home page]

In one of the recent articles about the Quality Raters, it says you can only work for a year and then you have to wait three months before you can re-apply. Is that true?

I know they say you can only be a rater for a year, but everyone I’ve talked to says that, as long as they get their hours in and keep up the quality, they are allowed to keep rating.

Is the schedule completely up to you, or do they give you assigned hours?

I schedule my own hours; as long as I get at least 10 but no more than 20, I stay on pretty good terms with them. They are very strict, but allow you to make up hours that you missed. So, if I only did four hours the first week, I could make up the hours by doing 16 hours the next week. Still only allowed 20 hours a week max, so if I miss more hours than I can make up, I’m out of luck.

They also tend to be really strict about their productivity goals. There is a certain number of tasks that I must complete every minute, depending on the task type. If I fall short of those goals, I am put on probation, during which I cannot work. If my quality isn’t up to par, they fire me. It’s a very controlled work environment.
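[Ed. note: The make-up arithmetic in this answer can be sketched as follows. This is only an illustration under stated assumptions: the function name is hypothetical, and the idea that a shortfall carries over from one week to the next is a guess, since the rater only gives the 10-hour minimum, the 20-hour maximum and the 4-then-16 example.]

```python
# Hypothetical sketch of the weekly hours rule described above.
WEEKLY_MIN = 10  # "at least 10"
WEEKLY_MAX = 20  # "no more than 20"


def plan_next_week(hours_worked, carried_shortfall=0):
    """Return (hours to work next week, hours that cannot be made up).

    carried_shortfall is an assumption: hours still owed from earlier weeks.
    """
    shortfall = carried_shortfall + max(0, WEEKLY_MIN - hours_worked)
    needed = WEEKLY_MIN + shortfall
    unrecoverable = max(0, needed - WEEKLY_MAX)
    return min(needed, WEEKLY_MAX), unrecoverable


# The rater's own example: 4 hours one week means 16 the next.
print(plan_next_week(4))
# A made-up case where more hours are owed than the weekly cap allows.
print(plan_next_week(2, carried_shortfall=6))
```

The first call reproduces the rater's example (6 missed hours plus the usual 10 gives 16). In the second, made-up case, the hours owed exceed the 20-hour cap, which is the "out of luck" situation described above.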

You mentioned there about getting fired “if my quality isn’t up to par.” How do you know if you’re doing a good job? It seems to me that in a lot of cases, rating search results is pretty subjective.

Results are subjective, but they have a quality center that shows your progress over time. They track how many returned results you have, how long it takes you to take care of a troubled rating, etc. While the rating is up to me, it has to be similar to what other raters have said. So, they track quality based on staying within the time period for rating tasks and the number of tasks you have returned to you.

They return tasks to you — what does that mean?

It means that there has been a disagreement on the rating and you have to go back in and come to an agreement with the other raters.

So, the rating of search results is a group project. Is it difficult to come to agreement?

Sometimes it’s harder to agree with raters, especially if they haven’t read the guidelines like they should or if they are just starting out. However, after enough exchanges, they have a moderator come in and choose which rating fits best. This moderator looks at our comments and makes a decision based on them.

How often does that happen in your experience?

Not very often. Most of the time if you give your reasoning for why you rated something one way, the other raters will agree with you. Most of the time, these types of disagreements occur when something is either slightly relevant or off-topic. Once in a while, someone will think that a page is spam that isn’t, or the other way around. I’ve only had a moderator step in once.

What do you know about the moderators? Are they Lionbridge employees?

Yes, they work for Lionbridge. From what I know of them, they used to be raters and then got promoted.

Do you only look at organic results, or are you also grading ads/PPC landing pages?

We look at any type of page on the web. Most of them are organic results, but some of the tasks are geared towards more ad-related topics.

Do you remember an example of an ad-related task?

Not really. Most of them were about placement on the page, the order in which the ads were presented, which one I would click, etc.

Do you look at Google Places results and other Universal results, like News or Videos?

Yes, we do. I can think of many tasks where it shows the map of what a user was looking at before they typed in a query, and we are then to rate the results of that query based on the map they were looking at. We also rated news based on how current it was, how relevant it was to the query, and if it came from a trustworthy source. As for videos, we had to watch the video to determine if it was a match for the query and rate it Useful, Relevant, Slightly Relevant, or Off-topic.

That part about Maps is really interesting. So, in that task, they were putting you in the middle of some process — you’re not just doing tasks that involve standalone searches, but sometimes taking into account what has happened before? Does that also happen with other searches, too?

Almost all of the tasks given have to do with user experience. Even with just the basic searches, we are given the user’s language and location before we can rate a page. It’s not about if a page fits a query, it’s about if a user would find the page useful. The Maps queries (called local queries) are the only ones that give what the user was looking at before searching, but we are supposed to keep in mind what a user is expecting to see from that query with every task type. For example, if someone was in Seattle and typed in the query “weather,” they would find a page showing the weather in Florida slightly relevant; however, someone in Tampa would find it useful.

Aside from the collective rating that you described above, do you ever have other communication with other raters? Are there official or unofficial places where you can chat back and forth?

There are lots of places — forums and such on the Lionbridge site — where raters can talk to each other, but I never interact with them. I was always stressed getting my hours in for the week, so I didn’t have time to mingle.

Can you share a specific example of one of your recent tasks?

I can’t think of the exact URLs I rated, but the keyword was “Nike Women’s Running Shoes.” It gave me a list of 20 URLs to rate (10 on each side) [Ed. note: he's referring to the "Side-by-Side" tasks mentioned earlier.] and I visited each one in order to determine whether they were vital, useful, relevant, slightly relevant, or useless. With a recognized brand name like that, it wasn’t hard to determine quality. For example, I think the Nike site was one of the options, so that would get a “vital” rating. I remember a couple of sites sold the shoes, so I gave them a “useful” rating, and the Wikipedia entry on Nike was given a rating of “slightly relevant” because I believe not many people searching for Nike Women’s Running Shoes want a history of the company.

Do you click through and review all ten results that show up for a given task?

I always click all the links simply because I’m not good enough to tell what the site is about by just reading its description. No one is good enough, that’s why they give us the links.

When you click through from a Google search result page, what are you looking for on the web page that you visit?

When looking at a site, I always check for spam signals first — keyword stuffing, hidden text, sneaky redirects, and the like. Once I know it’s a good site, I start to look at the page as a person who would type the query in Google and whether or not the content on the page would help me fulfill my needs. There are some tasks that ask about design and layout and the like, but for the normal URL rating or Side-by-Side tasks, I really just look at content and figure out if it would be a worthwhile page for a user to see.

Do you ever look at the source code or anything like that? Are Raters asked or trained to look at source code of the web pages being rated?

There is a quick primer on looking at the source code in the guidelines, nothing in depth. Basically we look for hidden keywords and other spammy tactics discussed in the guidelines.

You mentioned URL rating tasks and Side-by-Side tasks, but also some that involve design and layout. What are those tasks like?

Design tasks ask if the page has a good ratio of main content, supplemental content, and ads. They also ask about the overall design: whether it is easy to read, whether it communicates information clearly, and the like. It’s not about whether the page is beautiful or amazing, but whether or not the normal user could find what they need on the page without getting lost.

Do they give you a single web page and ask you to rate its design, or are you still going through a page of search results and then rating design?

They are specific tasks, not part of rating a URL.

Are spelling and grammar part of the design-based tasks?

Spelling and grammar are something we look at in all tasks (at least I do) but there’s not a ding for it.

When looking at design and layout, do your criteria change based on the type of site you’re looking at? For example, a web page on a big brand site might be expected to have a more professional design than some small business sites.

Like I said before, it’s more about the layout than the actual design. A company with a simple design would be rated just as well as a big company with a professional design as long as the information is clear and presented in a way that is easy to understand. To give you an example, a page where you can tell what the main content is, with ads taking second place in the design, would get a high rating. A page where the ads are confused with the main content, where you can’t tell the difference between content and ads, would get a low rating.

How many different kinds of tasks are there? The guidelines I’ve seen begin by saying “you will work on many different types of rating projects.”

There are a lot of different tasks, but they are all grouped under four main groups: URL, Side-by-Side, Experimental, and Result Review. The big one there is the Experimental group, which has a ton of different types of tasks in it. I’ve included a picture that lists all the task types and how long they are supposed to take, as well.

[Image: list of task types and how long each is supposed to take]

What are “Display Block” and “TTR” tasks?

Display Block, if I remember right, is a block of images that we rate as a whole rather than one at a time. TTR stands for Time to Rate, which is the baseline task they use to determine how long it should take to get a task done. It has all the different tasks in it, but instead of looking for accuracy it just cares about time.

Do they try to give you tasks related to topics and things you know about, or do you review pages about things you’re not very familiar with?

If someone types in “Best Dog Food for Puppies,” it’s not very hard to know what they are wanting and most queries have a fairly obvious meaning. However, once in a while I’ll get one that I can’t figure out and that’s when I do research to figure out what they want. For example, if someone queried “Release Liner,” I would need to do some research to figure out that it’s something used in cutting vinyl for signs and the like. At that point, I could determine whether a site is worthwhile or not. Granted, it’s not a perfect system but it works most of the time.

Are there specific industries/niches that show up more than others in your rating tasks?

Not that I have noticed.

How does your work affect Google’s search results — do they tell you anything about that?

They don’t talk about that; however, I know that what it really does is perfect the algorithm instead of changing actual live search results. I gathered this from the way that Side-by-Side tasks are the most important ones, because they show the old algorithm versus a change in the algorithm that they are testing.

Are you an active Rater these days? How long do you think you’ll keep doing it?

I still rate on the weekends. I like doing it, so I’ll keep doing it as long as I can.

Does Lionbridge and/or Google know that you work in the search marketing industry?

No. I got the search marketing job after I got the Lionbridge job.

Do you know of any other search marketers who are also Quality Raters?

I don’t know any personally, but I bet there aren’t a lot of us.

What’s your opinion of Google’s search results, and has that opinion changed since you became a Quality Rater?

I’ve always used Google as my “go to” search engine; however, since I became a rater, I’ve started using it more because I can see the behind-the-scenes improvements they are trying to make.

I like the idea that they have an army of actual people working towards bettering their engine. I know some people might think this wrong or even that raters have a negative effect on their rankings. Well, I can honestly say that they don’t. The whole point behind quality raters is not to rate the actual web, but rather rate how well Google is doing at providing quality results.

Almost every company has some form of quality control. Do people get upset that McDonald’s has someone check the quality of their food? I don’t see what Google does as any different than wanting to present the best possible product they can to their users.

So, to answer your question, yes, my opinion has changed for the better.

About The Author: Matt McGee is Editor-In-Chief of Search Engine Land. His news career includes time spent in TV, radio, and print journalism. His web career continues to include a small number of SEO and social media consulting clients, as well as regular speaking engagements at marketing events around the U.S. He recently launched a site dedicated to Google Glass called Glass Almanac and also blogs at Small Business Search Marketing. Matt can be found on Twitter at @MattMcGee and/or on Google Plus. You can read Matt's disclosures on his personal blog.

  • http://durak.org/sean/ sean dreilinger

    when i saw ads for this role on craigslist, i thought they might have been posted by one of the lesser PPC vendors attempting to crowdsource the generation of specious clickthrough on search ads under the guise of quality control.

  • http://metricvoodoo.com Michelle Moore

    Way to capitalize on a contractor’s flirtation with NDA violations…

    If site owners would simply listen to those of us who give advice and offer strategies based on the fact that we ARE raters but are prohibited from just handing out that info, I guess you wouldn’t be so tempted to blow open things like the raters guidelines and describing exactly how we help refine the algorithm. Alas, we’ve been telling people since the advent of search engine ranking to WRITE FOR YOUR AUDIENCE – DESIGN FOR YOUR AUDIENCE – stop trying to game the system and it WILL work!

  • http://www.williamrock.com William Rock

    Matt McGee great article, I am sure this person you reviewed had to hold some things back however a lot of good content was captured. One statement that hit home for me was really the quote

    “I like the idea that they have an army of actual people working towards bettering their engine.”
    We have seen major progress since 2001 and on with all the changes however 2011 and 2012 has topped them all as far as algo changes including now Google+

  • http://txtbrkr.com T.B.

    I think this is interesting from a usability and a content perspective. I thought this was interesting:

    “Are spelling and grammar part of the design-based tasks?
    Spelling and grammar are something we look at in all tasks (at least I do) but there’s not a ding for it.”

    In the end, as long as your content is interesting and engaging, it’ll pass the quality rater test, even if you aren’t a novelist.

    I wonder if raters only rate pages for the country they reside in. Would our rater be able to answer that?

  • http://www.netmagellan.com/ Ash Nallawalla

    No, raters in the US were recently being hired for UK English and other languages, but they needed to be in the US and allowed to work there.

    I recall Google Australia hiring locals to do this a few years ago and presume they still do, but our minimum wage is higher than in the US.

  • deepakkumar

    Hi Matt McGee

    Great information shared by the Google rater. The one thing still on my mind is what happens if the raters respond positively or negatively to a website. Does a positive response help improve the site's ranking, and if the rater is negative about the site, in which way will it affect the website?

  • http://milkmen.com/blog Cody Baird

    Thanks for the very interesting post, Matt. I would have liked to ask the “rater” if he is ever frustrated with personalized search results while rating. ;) I wonder if his strategy would be to log out of his Google account before work or if he would toggle “Hide Personal Results” for each query. ; )

  • http://www.agence-redaction-web.fr ARW

    TB says “In the end, as long as your content is interesting and engaging, it’ll pass the quality rater test, even if you aren’t a novelist.”

    This is true for Quality Rating, which is all about Google Algorithm. But the average non-robotic user deserves good grammar and spelling. Having poorly written content can really be an alarming sign for some users. If you are not able to publish proper English content about your business, are you even able to do business properly ?

    I like to see it as, who would you rather hire – all things being equal- a contractor who has a tie with grease spots on it, or a contractor with a clean tie (or without a tie) ? Or, of course, a novelist ?

  • http://uk.linkedin.com/in/ingobousa Ingo Bousa

    I used to be a German language Quality Rater for Google in 2006/2007. My first language is German, I had moved to the UK the year before, and it was via a UK temp agency. I had to pass the initial test and then there was a second, much more comprehensive real-time test that you had to pass in a certain time. I enjoyed working as a Quality Rater but it was also very repetitive and after a while a bit boring. Very unglamorous, and from one day to another they sent me an email stating that the project was over and my services not needed anymore. Since then I have seen Rating Guidelines popping up on the web now and then, and they looked like the ones I used except that the newer ones have some added criteria. The main subjects were always relevancy to a query, locality and then all sorts of spam detection. Was it paid well? Not really :D

  • N.O.

    What a really dumb move. Obviously this person is NDA bound, and I bet Google has a way of finding where to serve the lawsuit pretty quickly.

    I find the answer to this question very curious – “ How does your work affect Google’s search results — do they tell you anything about that?”

    Google has already told us exactly how raters affect Google’s results pretty clearly, with screenshots, in the YouTube video called “How Google makes improvements to its search algorithm” on the Google YouTube channel. How come I already knew this, but a rater didn’t?

    Something’s not adding up with this story.

  • N.O.

    deepakkumar wrote “what happens if the raters respond positively or negatively to a website. Does a positive response help improve the site's ranking, and if the rater is negative about the site, in which way will it affect the website?”

    Please, please, watch the YouTube video from Google called “How Google makes improvements to its search algorithm” located on the Google YouTube channel. It spells out the role of the rater. Google themselves say the raters work in a sandbox to test if algorithm changes are working properly and should go live. That's *it*.

    This article does far more harm than good. Raters are no different than regression testers who check for errors in the system after a change. Google outright tells us that in the YouTube video about raters. I don’t get why this website continues to push that there’s some deeper meaning or secret?

  • http://uk.linkedin.com/in/ingobousa Ingo Bousa

    “One thing I think the SEO community is missing is that this program has nothing to do with SEO or rankings. What this program does is help Google refine their algorithm. For example, the Side-by-Side tasks show the results as they are next to the results with the new algorithm change in them. Google doesn’t hire these raters to rate the web; they hire them to rate how they are doing in matching users queries with the best source of information.”

    ..probably the most important bit of this interview.

  • unavailable

    >> I get paid $14.50/hour

    I worked as a Quality Rater in 2006/7 and they paid $18/hr. As I was a SEO guy in those days, this job was heaven-sent. There were super smart people both on my level and above. The SEO lesson I learned, all in all, was “Think about your visitors first, your content second, and forget about Google.”

  • http://wpsites.net braddalton

    This is a great post but these people seriously need to do a hell of a lot more work.

    How many times have you come across forum questions containing exact match and broad match keywords in the top 3 of Google SERPS without an answer?

    Google’s quality is based on domain authority, not relevance of content but relevance of title. They still haven’t worked out how to index page/post content properly.

    It’s a joke for a multi-billion-dollar company

    Do you think Bing is superior?

  • http://twitter.com/MichelleMitzel Michelle Mitzel

    Maybe you should have asked this person about his or her nondisclosure contract. Anyone who is willing to discuss and have images posted in regard to private company information when he or she contractually agreed not to, gets no respect from me. This person portrays him or herself as someone with more knowledge than they can actually prove. Much of what they say is conjecture, not fact. Due to untrustworthy employees (such as the one from the article), these companies can’t trust workers with more than need-to-know information.

    What I’d like to ask your unnamed “informant” is, what if the company you work for decided to take all the personal information from your application, 1099, and resume, etc., and posted it on the Internet for anyone to see? How would you feel about that? Would you feel a bit violated if tables were turned and your private information was freely distributed? 

