A Comprehensive Look At Digg’s Recommendation Engine
It’s hard to believe, but Digg’s latest feature, the recommendation engine, has been in the works for over two years. Before they developed image functionality, launched visualization tools, released Google, Netvibes, and Myspace widgets, developed Facebook functionality, rolled out data portability enhancements, released updates to the comment system and to Digg the Candidates, and added several other features or enhancements to the social news site, the team at Digg already had the idea for a recommendation engine. They were looking at the site as a platform that monitors how you interact with it and improves your experience based on what it learns about you. In fact, they took so long that some people even took matters into their own hands.
“Digg is also learning a lot about what its users are into,” Rose said, “so it should be able to recommend stories based on what you’ve been digging and allow you to communicate with other people who have similar interests.” – BusinessWeek, March 27, 2006
It should come as no surprise that when Digg finally released the new feature, over 2 years later, the expectations were incredibly high. After all, sites like StumbleUpon and Reddit have had similar recommendation engines for quite a while now and are already innovating beyond that. My initial opinion of the recommendation engine is that it’s decent but with some very obvious limitations. While we should all be glad the feature has been released, it’s very much a version 0.5 beta release. Keep that in mind as you use it because the engine’s performance can vary quite a bit from day to day.
How Does it Work?
A lot of people have been speculating about how the recommendation engine actually works. For a more detailed answer, you should read Digg lead scientist Anton’s whitepaper on the subject or my exclusive guide and review. Here’s a brief overview.
Whenever you Digg a story, the recommendation engine records two things about the action: first, that you liked that story, and second, every user who Dugg the story before you (this includes the submitter). This signals to the recommendation engine that these users like the same content as you, and sometimes find it before you do, so it uses those signals to recommend stories that they Digg or submit.
The recommendation engine also records your history over the last 30 days to correlate your Digging habits with those of other users who Digg the same stories as you. The compatibility percentage tells you how closely your Diggs match another user’s, and based on that percentage, if you’re not already friends, you can add and follow each other.
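The two steps described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Digg’s actual implementation: the `DiggLog` class, its method names, and the compatibility formula (the share of your recent Diggs that the other user also Dugg) are all hypothetical. Only the 30-day window and the idea of tracking who Dugg a story before you come from the description above.

```python
from collections import defaultdict
from datetime import datetime, timedelta

WINDOW = timedelta(days=30)  # the 30-day history mentioned in the article


class DiggLog:
    """Hypothetical record of who Dugg what, and when."""

    def __init__(self):
        self.diggs_by_story = defaultdict(list)  # story -> [(user, time)]
        self.diggs_by_user = defaultdict(dict)   # user -> {story: time}

    def digg(self, user, story, when):
        self.diggs_by_story[story].append((user, when))
        self.diggs_by_user[user][story] = when

    def earlier_diggers(self, user, story):
        """Users who Dugg the story before `user` (submitter included)."""
        when = self.diggs_by_user[user][story]
        return [u for u, t in self.diggs_by_story[story]
                if u != user and t < when]

    def recent_stories(self, user, now):
        """Stories `user` Dugg within the last 30 days."""
        return {s for s, t in self.diggs_by_user[user].items()
                if now - t <= WINDOW}

    def compatibility(self, me, other, now):
        """Assumed formula: percent of my recent Diggs the other user shares."""
        mine = self.recent_stories(me, now)
        theirs = self.recent_stories(other, now)
        if not mine:
            return 0.0
        return 100.0 * len(mine & theirs) / len(mine)


now = datetime(2008, 7, 1)
log = DiggLog()
log.digg("alice", "story-1", now - timedelta(days=3))   # alice submits
log.digg("bob", "story-1", now - timedelta(days=2))
log.digg("bob", "story-2", now - timedelta(days=1))
log.digg("alice", "story-3", now - timedelta(days=40))  # outside the window

print(log.earlier_diggers("bob", "story-1"))
print(log.compatibility("bob", "alice", now))
```

Note that with this assumed formula the percentage is not symmetric: bob shares half of his recent Diggs with alice, while alice shares all of hers with bob. Whether Digg’s real percentage is symmetric is not stated in the sources above.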
How it Doesn’t Work
In theory, the system mentioned above can work fairly well. Under the right conditions of healthy diversity of opinion, independence between individual members, and decentralization in the community, this recommendation algorithm would work great. Doesn’t sound like the Digg community, does it? The Digg community is a very homogeneous and imitative community, and for a community like that the rules the recommendation engine uses have one severe impact: they ensure that a core power group of users is much more visible to the community than the average person and that their content gets promoted over many other users.
This isn’t necessarily a bad thing, but it certainly can be a bad thing.
First, why it may not be a bad thing. With the recommendation engine, users who have been participating on the site for a long time (and who, through their dedication to the community and the passion with which they participate, have developed a large following) become a lot more influential. For example, if I’m a popular user on the site and have a following of 100 people, every time one of those hundred Diggs one of my stories, my stories are recommended more often to that user (increasing our Digg correlation coefficient), and the same goes for every user who follows that user (my followers’ followers). By the end of it, someone with a following of 100 now has a following of 150, and the following grows not only in absolute numbers but also in diversity each time (because the correlation is never 100%). This may not necessarily be a bad thing, since these users are popular because they submit good content, and now more good content will be submitted and promoted to the front page.
At the same time, this can be bad for the Digg community over time because if the same users get promoted over and over again, it creates an even more homogenized community and you will see similar memes from similar sites, on similar topics, pop up again and again. This makes Digg a community where people go not to learn or to discover, but to reaffirm their beliefs regardless of right or wrong.
The problem with the recommendation engine isn’t that it fails to fulfill its promise of recommending content relevant to you. The problem is that it recommends relevant content on a basis that isn’t as efficient or equitable as it should be, and that may not help Digg get its edge back (for that we might need to look towards a freshness algorithm). Apart from recommending stories based on who you Digg and who they Digg, the engine needs to take into account what you Digg and where you Digg from (content, categories, hopefully tags, and sources); in other words, we need to get contextual. The recommendation engine was supposed to help you discover hidden gems, not show you stories from friends you usually Digg. Those are stories you would come across anyway. The methodology of the recommendation algorithm is also counterproductive to Digg’s agenda of emasculating this core userbase.
- Top 100 Digg Users Control 56% of Digg’s HomePage Content
- The Power of Digg Top Users (One Year Later)
- Analysis: Top 100 Digg Users
- Today: 31% of Digg Homepage submitted by 10 Users
Another limitation of the system, and this one is intentional and by design, is that the “Diggers Like You” section doesn’t allow you to expand the list to more people. The system, while matching you with other users, automatically decides on a cutoff point for the correlation and shows you only the people who make the cut. I would like to meet a Digger who has only a 25% overlap with me. Maybe there are other stories I might like but am not seeing.
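To make the cutoff complaint concrete, here is a minimal sketch of how such a filter might behave. Everything in it is an assumption for illustration: the function names, the sample users, and especially the 50% cutoff, since Digg has not said where it draws the line.

```python
def diggers_like_you(compatibility_by_user, cutoff=50.0):
    """Users at or above the (hypothetical) cutoff, best match first."""
    shown = [(user, pct) for user, pct in compatibility_by_user.items()
             if pct >= cutoff]
    return sorted(shown, key=lambda pair: pair[1], reverse=True)


def hidden_matches(compatibility_by_user, cutoff=50.0):
    """Users with some overlap who fall below the cutoff and are never shown."""
    hidden = [(user, pct) for user, pct in compatibility_by_user.items()
              if 0 < pct < cutoff]
    return sorted(hidden, key=lambda pair: pair[1], reverse=True)


# Hypothetical compatibility percentages for one reader.
scores = {"alice": 82.0, "bob": 61.5, "carol": 25.0, "dave": 0.0}

print(diggers_like_you(scores))  # the list the reader actually sees
print(hidden_matches(scores))    # the 25% Digger the reader never meets
```

Letting the reader pass their own `cutoff` is the kind of control the section currently lacks.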
Finally, from a design perspective, people should be able to set what upcoming queue opens for them by default and how many stories are listed in their ‘Recommendations in Upcoming’ widget.
The Unexpected Consequences
The recommendation engine has many unanticipated effects on everyone participating in the process: those creating the content, those submitting it, and those reading it.
Partly because of the hype around the new feature, partly because of the shiny red beta tag that people see in their profiles, and partly because the feature works better than any other feature on Digg for finding stories you would like, the recommendation engine has caused a big spike in activity on Digg. Core users are submitting a much higher volume of stories and voting on correspondingly more. At the same time, Digg is not throttling how many stories can be promoted to the front page in a day, so this increased activity means many more stories are being promoted. I have also seen many instances of stories being promoted in 6 hours or less (when they get recommended to a lot of people and get a burst of instant Diggs), and I have seen some stories get removed from the upcoming queue but still get promoted well after the 24-hour mark. There have also been stories with an astronomical number of Diggs that never got promoted. And because so many more stories are being promoted, each story gets even less time on the front page than before, which means fewer Diggs per story, fewer comments per story, and less outbound traffic per story. For those interested, this also dampens the link-building effect that Digg is known for.
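The squeeze on front-page time is simple arithmetic, which the sketch below works through. The numbers are made up for illustration (15 front-page slots, 100 versus 150 promotions a day), and it assumes a fixed-size front page where each newly promoted story pushes the oldest one off.

```python
def average_hours_on_front_page(front_page_slots, promotions_per_day):
    """Average time a story stays on a fixed-size front page.

    Assumes a steady flow: every promotion displaces the oldest story,
    so time on the page = slots / (promotions per hour).
    """
    if promotions_per_day <= 0:
        raise ValueError("promotions_per_day must be positive")
    return 24.0 * front_page_slots / promotions_per_day


# Hypothetical figures, not Digg's real numbers.
before = average_hours_on_front_page(front_page_slots=15, promotions_per_day=100)
after = average_hours_on_front_page(front_page_slots=15, promotions_per_day=150)

print(f"before: {before:.1f} hours per story")
print(f"after:  {after:.1f} hours per story")
```

Under these assumptions, a 50% rise in promotions cuts each story’s front-page time by a third, and Diggs, comments, and outbound clicks per story would be expected to shrink along with it.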
How to Make the Most of it
Making the most of the recommendation engine is very easy. All you have to do is submit content you like and vote on content you like. The system will do the rest. The best part is that the more you use it, the better it gets at choosing both the stories recommended to you and the people your stories are recommended to.
What it Means for Digg and for Advertisers
To truly understand the impact of the Digg recommendation engine you have to compare it to Facebook’s much-maligned Beacon project. The recommendation engine is something that, over time, will record exactly how you interact with Digg’s platform, and know what stories/sources/links you like, what top-level categories and topics you like, and if they are smart, exactly what content you’re consuming (e.g., contextually determining what music you like, what movies you’re anticipating, and what new Apple products are on your wishlist). The beta only records some very basic information (that’s why it’s a beta), but it is expected to become much more complex and robust as time goes on. This means better recommendations for you, but it also means that since Digg has a more comprehensive picture of who you really are, they can sell advertising that is better for you and for the advertiser, and for Digg’s pocketbook.
What This Means for the Future
The future of the recommendation system isn’t much different from what Facebook plans to accomplish with Beacon. Beacon has a head start and already records offsite activities such as “…purchasing a product, signing up for a service, adding an item to a wish list, and more.” Digg’s ultimate goal is to be the decision market not only for all sorts of media (text content, music, video, podcasts) but also for food/bars/clubs, shopping and entertainment, and more, and the recommendation engine is one more step toward that goal.
Note: According to Kevin Rose’s TWiT appearance last week, activity is up as much as 40% in some areas of the site.
Opinions expressed in this article are those of the guest author and not necessarily Search Engine Land.