I am a Wikipedia administrator, and I specialize in complex investigations. When Jonathan Hochman suggested I write an article for Search Engine Land, he mentioned that this publication and its readers regard Wikipedia as a search engine. It probably comes as no surprise that my spine stiffens at that concept, but media professionals and Wikipedia volunteers seldom understand each other. So I’ll illustrate my perspective with an example: let’s have a look at some politicians.
I cite this instance because it was not one of the cases I handled and it has already been in the news. On January 27, 2006, the Lowell Sun ran a story about Wikipedia’s biography of Representative Marty Meehan, reporting that an IP address originating from the congressional offices of the United States House of Representatives had erased the congressman’s broken term limit pledge from the article. Here’s the edit, which Wikipedia jargon calls a "page diff".
The story became national news when people uncovered other congressional attempts to spin Wikipedia biographies of sitting legislators. That attention led to a fresh pledge by Rep. Meehan that his staff would stop editing Wikipedia. Several of his embarrassed colleagues issued similar assurances.
Rep. Meehan has since become Chancellor Meehan of the University of Massachusetts, but that IP address continues to be shared by various congressional offices. So we’ll see whether those pledges have stood the test of time. The block history raises my eyebrow.
A few clues and clicks from there, I reach the biography of New York Rep. Carolyn McCarthy and locate an edit by that IP address dated 19 June 2007. It deleted a well-referenced but unflattering section describing an MSNBC interview that had exposed her ignorance about an assault weapons ban she had proposed. Other dubious activity from June 2007 includes blanking vandalism to the biographies of two Tennessee state politicians: Matthew Hill and David Davis. Someone who had access to a congressional office computer didn’t want the public to read properly cited information about their ties to the pharmaceutical industry.
I uncovered this information in ten minutes, and my sysop tools weren’t necessary for any of the research. You can see for yourself in that IP address’s public contribution history.
None of those inappropriate edits remained in Wikipedia’s live version very long. Site volunteers reverted most of them one minute after implementation; the longest endured for nineteen minutes. What is notable for this discussion is how each action gets logged in site histories, where it remains a public and voluntary disclosure. Anyone with an Internet connection can find the rest of that trail; it remains fresh as morning snow. A good follow-up exercise for SEO professionals would be to track the edits that led up to that IP address’s April 2007 blocks for link spamming the Congressional Black Caucus website.
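The trail described above can also be read programmatically. What follows is a minimal Python sketch, not an official tool. It assumes the MediaWiki Action API's `list=usercontribs` module at English Wikipedia's `api.php` endpoint. The function names, the placeholder address `192.0.2.1` (a documentation-only IP, not the congressional one), the byte threshold, and the sample response are all illustrative assumptions.

```python
from urllib.parse import urlencode

# English Wikipedia's Action API endpoint.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def contribs_url(user, start, end, limit=50):
    """Build a usercontribs query for one account or IP, oldest edit first."""
    params = {
        "action": "query",
        "list": "usercontribs",
        "ucuser": user,
        "ucdir": "newer",  # oldest first, so ucstart must precede ucend
        "ucstart": start,
        "ucend": end,
        "ucprop": "title|timestamp|comment|sizediff",
        "uclimit": limit,
        "format": "json",
    }
    return API_ENDPOINT + "?" + urlencode(params)


def large_removals(response, threshold=-1000):
    """Return (timestamp, title, sizediff) for edits that removed many bytes.

    A big negative sizediff is only a hint of section blanking; each hit
    still needs a human to read the actual page diff.
    """
    contribs = response.get("query", {}).get("usercontribs", [])
    return [
        (c["timestamp"], c["title"], c["sizediff"])
        for c in contribs
        if c.get("sizediff", 0) <= threshold
    ]


# Hypothetical response shaped like the API's JSON; the data is made up.
sample = {
    "query": {
        "usercontribs": [
            {"title": "Example biography", "timestamp": "2007-06-19T15:00:00Z",
             "comment": "", "sizediff": -2400},
            {"title": "Example town", "timestamp": "2007-06-20T09:30:00Z",
             "comment": "fix typo", "sizediff": 3},
        ]
    }
}

print(contribs_url("192.0.2.1", "2007-06-01T00:00:00Z", "2007-06-30T23:59:59Z"))
print(large_removals(sample))
```

Fetching the URL is deliberately left out; any HTTP client will do, and the same histories are visible in a browser through each page's history tab.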
As an administrator I am continually surprised by how often I see edits that serve little purpose except to place the editor at risk for adverse news coverage. Few of the people who have a professional interest in knowing how Wikipedia operates actually possess more than a superficial understanding of its workings. Copyleft licensure, for example, requires that each action contain an authorship notation and ensures the information remains freely reproducible. Activities at Wikipedia are transparent, yet many individuals who have a professional reputation to protect behave as if their actions were guarded by an opacity the site does not possess. Except for article deletions and occasional courtesy blankings, Wikipedia’s archives remain public and accurate back to December 2001.
For each case that actually makes the papers I see perhaps two dozen that look newsworthy. Only a corresponding ignorance among investigative journalists has shielded the rest. Reporters habitually rely on tips, yet they have little inherent need for tips when the story relates to Wikipedia. In my estimation they will soon learn how to research leads for themselves. Wikipedia’s prominence as the Internet’s most popular nonprofit volunteer-driven site usually means a story cascades into international coverage once it breaks.
In January 2007 Microsoft learned a hard lesson in black hat tactics. Blogger Rick Jelliffe reported that Microsoft had offered to pay him to edit Wikipedia’s Office Open XML page on their behalf. The strong appearance of impropriety quickly turned that revelation into international news.
One point that was lost in most news reports was that Microsoft might actually have had a legitimate point about the balance of coverage on Wikipedia’s Office Open XML article. The folks at Microsoft could have improved that page while avoiding a PR debacle. Never mind what the Telegraph story says: I do not issue site bans simply because an editor has a conflict of interest. Nor, to my knowledge, does any Wikipedia administrator. We do ask people such as yourselves to act with discretion. So to address the original matter about Wikipedia being a search engine, site policy explicitly disavows that purpose.
Wikipedia’s Conflict of Interest guideline dovetails with several policies and experienced editors have written essays to provide supportive advice. These ought to be must-read material for any Search Engine Land regular. Rather than regurgitate what they already say I’ll summarize what I consider the most important points and add some suggestions of my own. These tips are not the official position of the Wikimedia Foundation, yet there’s a good chance that business editors who run afoul of policy will encounter either me or someone I’ve trained.
Wikipedia white hat activity in a nutshell: Designate a particular individual to be the Wikipedia liaison. Have that person register an account and declare the conflict of interest on the account’s user page. Then post suggested changes to article talk pages. For a variety of reasons this approach is safer and results in more durable changes than direct editing in conflict of interest situations.
Eight underused Wikipedia white hat strategies
- Provide line citations. This is one point where an SEO professional’s interests often coincide with Wikipedia’s goals: factual verification is important to the project. When used judiciously, citations can be the most durable way to send traffic to your client’s website. Focus on topics where that site is strong on content and compliant with Wikipedia’s Reliable Sources guideline. In many instances a client’s site may be a self-published source, which limits how it can be used as a reference. Sourcing is welcome at articles that are already flagged with requests for citations. In other instances it is better to post preformatted citation suggestions at article talk pages. Supply text that summarizes the referenced content when making a citation, use wikimarkup, and conform to whatever citation format is already in use at the page. Act with care in order to avoid Wikipedia’s spam blacklist or criticism from volunteer editors. Tailor each suggestion to relevant content in the particular article.
- Use edit summaries. These are courtesies to other editors who review your contributions in history files. Edit summaries are also an effective self-check against link spam. If you can’t think of anything better to write than "inserting outgoing link", it’s time to rethink your practices.
- Seek mentorship. Wikipedia’s Adopt-a-user mentorship program helps new users adjust to site standards. This can be particularly useful because some people come to the site with misconceptions drawn from inaccuracies in mainstream press reports about Wikipedia. Experienced editors and administrators interpret formal mentorship as a positive sign.
- Get to know the "what links here" tool. Each Wikipedia page has an index of the other Wikipedia pages that link to it, which shows where its internal incoming traffic comes from. Wherever a Wikipedia article has a durable outgoing link to your website, examine its incoming links list for any obvious omissions, then run a text search on those omitted pages. If the title of the other article already exists in unlinked form, go ahead and link it. Otherwise compose sample linking text and propose it at that article’s talk page.
- Utilize article categories. Another route that drives internal Wikipedia traffic is its category system. For instance, Wikipedia’s "Search engine optimization" article has four categories: "Internet advertising and promotion", "Internet terminology", "Search engine optimization", and "Internet marketing by method". Learn the category structure and add categories as appropriate at articles that link to your client’s site. Categories and wikilinks are also good ways to locate unreferenced article passages where your client’s site might become a citation.
- Develop a watchlist. This tool provides swift notification of talk page replies and article changes. Use it to keep in touch with other editors. Watchlists also alert you to obvious vandalism and a track record of reverting vandalism helps you earn the respect of other Wikipedians.
- Contact WikiProjects. WikiProjects are coordinating centers where Wikipedians with similar interests plan and prioritize article improvements. When you propose a major edit to an article talk page, a polite query for input at the relevant WikiProject can generate help and feedback.
- Accept feedback. If your links are getting reverted and experienced editors are making complaints then you’ve probably misunderstood site standards. Pay attention to what they tell you, ask questions, and adjust your approach.
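The "what links here" and category tips above can be partly scripted. Here is a minimal Python sketch, not an official tool. It assumes the MediaWiki Action API's `list=backlinks` and `prop=categories` query modules, and the helper names are hypothetical. The text check is a rough heuristic that ignores section anchors and templates, so review every result by hand and take anything doubtful to the article's talk page.

```python
import re

# Matches [[Target]] and [[Target|label]]; group 1 is the link target.
WIKILINK = re.compile(r"\[\[([^\[\]|]+)(?:\|[^\[\]]*)?\]\]")


def backlinks_params(title, limit=50):
    """Query parameters for an article's "what links here" list."""
    return {"action": "query", "list": "backlinks", "bltitle": title,
            "blnamespace": 0, "bllimit": limit, "format": "json"}


def categories_params(title, limit=50):
    """Query parameters for the categories an article belongs to."""
    return {"action": "query", "prop": "categories", "titles": title,
            "cllimit": limit, "format": "json"}


def linked_titles(wikitext):
    """Lowercased targets of every wikilink in the text."""
    return {m.group(1).strip().lower() for m in WIKILINK.finditer(wikitext)}


def has_unlinked_mention(wikitext, title):
    """True if the title appears as plain text and is not linked anywhere."""
    if title.lower() in linked_titles(wikitext):
        return False
    # Drop existing links so their labels are not counted as mentions.
    plain = WIKILINK.sub(" ", wikitext)
    pattern = r"\b" + re.escape(title) + r"\b"
    return re.search(pattern, plain, re.IGNORECASE) is not None


# Made-up wikitext for illustration.
text = "SEO overlaps with [[web analytics]] and link building."
print(has_unlinked_mention(text, "Link building"))   # True: mentioned, unlinked
print(has_unlinked_mention(text, "Web analytics"))   # False: already linked
print(backlinks_params("Search engine optimization"))
```

The parameter dictionaries can be sent to `https://en.wikipedia.org/w/api.php` with any HTTP client. The script only suggests candidates; the editing decision stays with a person who has read the page.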
Although Wikipedians are understandably skeptical about conflict of interest editing, SEO professionals who respect the site as an encyclopedia rather than as a quirky search engine can earn acceptance from its volunteers. Look for approaches that reconcile your goal of sending traffic to websites with Wikipedia’s goal of being an informative and reliable first stop for research.
Durova is a Wikipedia administrator who confronts some of the site’s most disruptive editors. She uses a pen name to avoid harassment in real life. After graduating from Columbia College, Durova attended film school and also served in the US Navy.
Opinions expressed in the article are those of the guest author and not necessarily those of Search Engine Land.