Google Uses Amazon’s Algorithm For YouTube’s Recommendation Engine

Greg Linden reports that Google has switched the algorithm it uses for YouTube’s recommendation engine from its own to a variation of the algorithm Amazon designed in the late 1990s.

This is an interesting move, given that Google has the manpower and smarts to build a fairly good recommendation engine on its own. But here it opts for an algorithm Amazon designed in 1998? Of course, the best algorithms are enhancements on top of previous algorithms. Google’s own search algorithm is light-years beyond where it first was with the original PageRank patent.
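Amazon’s 1998 approach is item-to-item collaborative filtering: rather than matching similar users to each other, it precomputes, for each item, which other items the same users tend to interact with. A minimal sketch of that idea, using cosine similarity over co-interaction counts (all function and variable names here are illustrative, not from Amazon’s or Google’s actual code):

```python
from collections import defaultdict
from itertools import combinations
import math

def item_similarities(user_items):
    """user_items: dict mapping a user id -> set of item ids they interacted with.
    Returns a dict mapping (item_a, item_b) -> cosine similarity, where the
    similarity is co-interaction count normalized by each item's popularity."""
    item_counts = defaultdict(int)   # how many users touched each item
    pair_counts = defaultdict(int)   # how many users touched both items in a pair
    for items in user_items.values():
        for item in items:
            item_counts[item] += 1
        for a, b in combinations(sorted(items), 2):
            pair_counts[(a, b)] += 1
    sims = {}
    for (a, b), co in pair_counts.items():
        sims[(a, b)] = co / math.sqrt(item_counts[a] * item_counts[b])
    return sims
```

Normalizing by popularity keeps blockbuster items from dominating every similarity list, which is the same reason YouTube’s co-visitation counts are normalized in the paper.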

Here is a relevant snippet from Google’s RecSys 2010 paper:

Recommending interesting and personally relevant videos to [YouTube] users [is] a unique challenge: Videos as they are uploaded by users often have no or very poor metadata. The video corpus size is roughly on the same order of magnitude as the number of active users. Furthermore, videos on YouTube are mostly short form (under 10 minutes in length). User interactions are thus relatively short and noisy … [unlike] Netflix or Amazon where renting a movie or purchasing an item are very clear declarations of intent. In addition, many of the interesting videos on YouTube have a short life cycle going from upload to viral in the order of days requiring constant freshness of recommendation.

To compute personalized recommendations we combine the related videos association rules with a user’s personal activity on the site: This can include both videos that were watched (potentially beyond a certain threshold), as well as videos that were explicitly favorited, “liked”, rated, or added to playlists … Recommendations … [are the] related videos … for each video … [the user has watched or liked after they are] ranked by … video quality … user’s unique taste and preferences … [and filtered] to further increase diversity.
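The personalization step quoted above — seeding with the videos a user has watched or liked, expanding through related-video associations, then ranking and filtering — can be sketched roughly as follows. A simple co-occurrence count stands in for the paper’s ranking by video quality and user taste, and all names are illustrative:

```python
from collections import Counter

def recommend(watched, related, k=5):
    """watched: iterable of video ids the user engaged with (watches, likes, etc.).
    related: dict mapping a video id -> list of related video ids
             (e.g. from co-visitation association rules).
    Returns the top-k candidate video ids, excluding already-seen videos."""
    seen = set(watched)
    scores = Counter()
    for video in watched:
        for candidate in related.get(video, []):
            if candidate not in seen:   # filter out what the user already saw
                scores[candidate] += 1  # stand-in for quality/taste ranking
    return [vid for vid, _ in scores.most_common(k)]
```

A real system would also enforce the diversity filtering the paper mentions (e.g. capping how many candidates come from any single seed video), which this sketch omits.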

To evaluate recommendation quality we use a combination of different metrics. The primary metrics we consider include click through rate (CTR), long CTR (only counting clicks that led to watches of a substantial fraction of the video), session length, time until first long watch, and recommendation coverage (the fraction of logged in users with recommendations). We use these metrics to both track performance of the system on an ongoing basis as well as for evaluating system changes on live traffic.
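Two of the quoted metrics, CTR and long CTR, are straightforward to compute from impression logs. A hedged sketch — the 0.5 watched-fraction cutoff for a “long” watch is my assumption, since the paper does not give an exact threshold:

```python
def ctr_metrics(impressions, long_watch_fraction=0.5):
    """impressions: list of dicts, each with 'clicked' (bool) and, for clicked
    impressions, 'watched_fraction' (0.0-1.0) of the video that was watched.
    Returns (ctr, long_ctr), both as fractions of all impressions."""
    shown = len(impressions)
    clicks = sum(1 for i in impressions if i["clicked"])
    long_clicks = sum(
        1 for i in impressions
        if i["clicked"] and i.get("watched_fraction", 0.0) >= long_watch_fraction
    )
    return clicks / shown, long_clicks / shown
```

Long CTR is the more robust of the two: plain CTR can be inflated by clickbait thumbnails, while long CTR only credits recommendations that led to substantial watch time.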

Recommendations account for about 60% of all video clicks from the home page … Co-visitation based recommendation performs at 207% of the baseline Most Viewed page … [and more than 207% better than] Top Favorited and Top Rated [videos].

About The Author: Barry Schwartz is Search Engine Land's News Editor and owns RustyBrick, a NY-based web consulting firm. He also runs Search Engine Roundtable, a popular search blog on very advanced SEM topics. Barry's personal blog is named Cartoon Barry.

  • Matt Cutts

    Just in case this story is getting attention because of yesterday’s news about Microsoft: if you read the comments on that blog post, one person showed up to say “Amazon wasn’t the first to use item-item collaborative filtering, right? I remember seeing this at Firefly in 1996.” Greg says that Firefly used a different collaborative algorithm, and the original commenter said “Firefly definitely had item-item similarities.”

    Personally, I see a big difference between trying different algorithmic ways to solve a problem vs. sending Google clicks directly to Microsoft, and Microsoft then using that data in their ranking.

  • Michael Martinez

    “Personally, I see a big difference between trying different algorithmic ways to solve a problem vs. sending Google clicks directly to Microsoft, and Microsoft then using that data in their ranking.”

    THOSE CLICKS DO NOT BELONG TO GOOGLE. They belong to the users who execute them, and those users are free to share those clicks with whomever they please.

  • Kevin Pike

    Surprised to see only two comments on this story – while Danny is getting beat up for his “BING cheats” story.

    @MattCutts Not sure I follow the “clicks thing”. Couldn’t BING just scrape Google results without clicks?

  • Barry Schwartz

    Matt, I did not cover this story as a Bing vs. Google follow-up from yesterday. I just found it interesting that Google didn’t build this from the ground up and used a variation of Amazon’s algo.

    Again, I don’t think I implied this had anything to do with Bing vs. Google.

    Sorry if you read it that way, but I understand why.

  • Dave_Lawlor

    What I seem to see is that Matt is more concerned with PR than spam results lately. Perhaps he should change departments. Way to be proactive on SOMETHING, Matt! But hey, if I Google 100 terms and only 9 (lol, yeah, we know it would be way higher) of them return spam in the top 5, I can claim Google has lost the battle against spam, right? Because 10% is the bar Matt and Big G have set for proving something absolute. Bonus points if more than 10% of those spam sites are being monetized with Google AdWords!

    Worry about your own results first, Google, before throwing stones at the competitors.

  • Adrac Ltd

    I agree with Matt on “sending Google clicks directly to Microsoft, and Microsoft then using that data in their ranking”.

    Microsoft has every right to collect data from users of the IE browser when those users browse websites directly by typing the URL.

    However, I do not understand why they would just blindly pick up search results from Google as part of their ranking algorithm. The examples Google showed are a true indication that Google’s search results carry high authority in Bing’s search results.

    Google itself is currently having problems with its search quality, and I hope they’re working on producing much better search results. So why even bother looking at Google’s results instead of working on improving their own search algorithm?

    Bing truly has some cool features, which is well deserved; however, the search algorithm still needs improving to drive better results.

  • Arnie K

    I find this a bit interesting. My partners and I were awarded a patent for a recommendation platform that is based on item to item correlations. We were issued the patent in 2000. Long story behind it, but yes we still own it.
