Five Steps To Clean Up Your Links Like A Techie

You’ve been in business for many years. You may have done it all on your own, or you may have enlisted the aid of unscrupulous SEO “services” to build links for you. Either way, you may be wondering whether you need to conduct a link cleanup. This column contains a five-step process outlining how to determine if you have a problem, how to analyze it, and then how to clean it up.

But first:

Unnatural Links Warning = Problem

If you get this message, you have a problem. Skip to step two.

An Unnatural Link Warning from Google Webmaster Tools

Step One: Check With Google

Obviously, you can’t email Google and ask them if you have some lousy links. But you can get a clue from the Google Webmaster Tools link list. To access it in the new navigation, go here:

Webmaster Tools Links List - New Navigation

Before you download anything, look at the numbers:

Sample of “Links to Your Site”

Compared to the total of 199k links, 76k (roughly 38%) is a large share. If you have a distribution like this, you may have some cleanup to do.

Also on this screen, look over the list of “Your most linked content.” If a large proportion of the links are pointing directly to your homepage, you may have a problem.

Finally, click on the More >> button below “Your most linked content.” If your source domains seem low for the number of links to a particular piece of content, you probably have what are referred to as “run of site” links. Those are often the source of penalties because they don’t really add any value, and they are probably in an ad, a blogroll, a list of links, or the footer.

Look at how many links you have by domain to spot run-of-site links.

Notice that I use amusing words like “may” and “might” here. That’s because there are no hard and fast rules dictating which links are “bad” and which aren’t; most of it is a matter of judgment. While the one circled above does look suspicious, what you’d eventually discover (later in the process) is that it is an affiliate link formatted as a 302 redirect — therefore, it does not pass PageRank.
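To make the run-of-site check in this step concrete, here is a minimal Python sketch. It assumes you have copied the numbers from “Your most linked content” into a list of (page, links, source domains) tuples. The function name, the sample figures, and the threshold of 100 links per domain are illustrative assumptions, not rules from Google; a high ratio is only a hint that a page deserves a closer look.

```python
def flag_run_of_site(rows, min_ratio=100):
    """Return (page, links-per-domain) pairs whose ratio looks suspicious.

    rows: iterable of (page, link_count, source_domain_count) tuples.
    min_ratio: hypothetical cutoff; tune it to your own link profile.
    """
    flagged = []
    for page, links, domains in rows:
        # A page with links but no listed domains is treated as maximally odd.
        ratio = links / domains if domains else float("inf")
        if ratio >= min_ratio:
            flagged.append((page, round(ratio, 1)))
    # Worst offenders first.
    return sorted(flagged, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = [("/", 17500, 27), ("/about", 40, 20)]
    for page, ratio in flag_run_of_site(sample):
        print(page, ratio)
```

Anything this flags still needs the manual judgment described above; an affiliate link behind a 302 redirect, for example, would be flagged here even though it does not pass PageRank.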

Step Two: Gather Information

Okay, let’s assume you’ve determined that you most likely have a problem, or you’ve received a message that says you do. Where do you begin? The answer, from Google’s own blog, is to start with the link lists in Google Webmaster Tools. Begin by heading to your list of “Who links the most” in Webmaster Tools and downloading the following two reports:

all-domains-download

Combine them in Excel, sort by Col A ascending and de-duplicate them (note that you’ll have to uncheck “First discovered” since that is only available on one of the reports):

remove-duplicates
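If you would rather not do the combine-and-dedupe step in Excel, the same thing can be sketched in a few lines of Python. This assumes each downloaded report is a CSV file with a header row and the link URL in the first column (Col A); the function names and file names are hypothetical.

```python
import csv


def load_rows(path):
    """Read one downloaded report, skipping its header row."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        next(reader, None)  # discard the header line
        return [row for row in reader if row]


def merge_and_dedupe(*reports):
    """Combine reports and keep each URL once, sorted ascending.

    Only the first column is used, which mirrors unchecking
    "First discovered" when removing duplicates in Excel.
    """
    urls = {
        row[0].strip()
        for report in reports
        for row in report
        if row and row[0].strip()
    }
    return sorted(urls)


# Example usage (file names are placeholders):
# links = merge_and_dedupe(load_rows("report_one.csv"), load_rows("report_two.csv"))
```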

Now that you have a deduplicated list, there are two things you should do:

  1. Run all of the links through a header checker, like Xenu’s Link Sleuth or Screaming Frog. Filter out every URL that returns a status other than 200 or 301 and put those in another list. Don’t delete this list; you’ll need it later.
  2. If you have the means, run the list through a script that checks for “nofollow” code on the link or in the meta robots tag. Put these in another list as well.

Now, you should be left with a smaller list that includes followed links with 200 and 301 status codes. These you will have to check manually.
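For readers without a header checker, the first item above can be approximated with Python’s standard library. This is a rough sketch, not a replacement for a dedicated crawler: the function names are my own, it sends a HEAD request (which some servers reject), and it uses `http.client` because that module does not follow redirects, so a 301 is reported as a 301 rather than as the status of the final page.

```python
import http.client
from urllib.parse import urlsplit


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, or None if the request fails."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        return conn.getresponse().status
    except (OSError, http.client.HTTPException):
        return None  # dead domain, timeout, refused connection, etc.
    finally:
        conn.close()


def partition_by_status(urls, get_status=fetch_status, keep=(200, 301)):
    """Split urls into (to_review, set_aside) lists of (url, status) pairs."""
    to_review, set_aside = [], []
    for url in urls:
        status = get_status(url)
        target = to_review if status in keep else set_aside
        target.append((url, status))
    return to_review, set_aside
```

The `set_aside` list is the one to keep for later; `to_review` feeds the nofollow check and then the manual review.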

Step Three: Check Links Manually

  1. Load the page in a Web browser.
    • If the page does not load (i.e., you get a 404/Not Found error), note it and move on. Double check that the header checker you used works properly, but know that a few days to a week can pass between when you collect the lists and run them through the header checker and when you start manually reviewing links. More links may become obsolete during this time.
    • If the page does not load any content, but you do not get an error, you still need to check it. Sometimes links are hidden with same color text, inside invisible frames, or placed off the page via CSS.
  2. View the source of the page. In most browsers, you can right-click the page and choose the view-source option, or press CTRL+U.
  3. Do a find/search for your domain name. Leave off the www in case the link is without the www.
    • If it is not found, mark the link as “Link Removed.”
    • If it is found, check to see if it’s a decent link.
  4. First, search the page for “nofollow.”
    • If it is returned inside the <head> tag, you can mark the link as “No Follow” and move on.
    • If it is returned in the body tag, check to see if it is in the same <a> tag (the one containing the href) as one of your links.
      • If it is, mark the link “No Follow” and move on.
  5. If the link is on the page, and it is not nofollowed, note where the link appears and look at it in the HTML page.
  6. If any of the following are true, mark it disavow.
    • The link is with a collection of unrelated links.
    • The link is on the right or left sidebar or the footer (not in the main content).
    • The link is in the comments.
    • The link is in the source but does not appear on the rendered page (meaning it’s hidden).
    • The page has a link anywhere to “submit a link” or “submit an article” or something similar.
    • The page looks like gibberish, spam, or like it was created for the sole purpose of SEO (it might use keywords really heavily, it might list a site’s PR, or say it is SEO friendly).
  7. Finally, if the page passes all of these tests, do the “smell” test.
    • Does the link add value to the site’s visitors? It’s probably ok.
    • Does the link seem like it was shoehorned into an article about something else? It’s probably not ok.
    • Does the link seem like it was included due to a paid arrangement? It’s probably not ok.
  8. Mark the link Disavow or Ok, provide a reason based on one of the above, and move on. Don’t skip the reason; you may find yourself doing this a second and even third time after Google’s response, so you don’t want to have to re-check anything!
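Items 3 and 4 of the checklist above (searching the source for your domain and for “nofollow”) are mechanical enough to sketch in Python. The example below uses the standard library’s `html.parser`; the class and function names are hypothetical, and the “Check Manually” label is my own placeholder for links that still need the human judgment in items 5 through 7. It cannot detect hidden links or judge quality.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class LinkAudit(HTMLParser):
    """Collect links to one domain and note any nofollow signals."""

    def __init__(self, domain):
        super().__init__()
        self.domain = domain.lower()
        self.page_nofollow = False  # meta robots nofollow seen
        self.links = []             # (href, link_is_nofollow) pairs

    def handle_starttag(self, tag, attrs):
        attrs = {name: (value or "") for name, value in attrs}
        if tag == "meta" and attrs.get("name", "").lower() == "robots":
            if "nofollow" in attrs.get("content", "").lower():
                self.page_nofollow = True
        elif tag == "a":
            href = attrs.get("href", "")
            host = (urlsplit(href).hostname or "").lower()
            # Match the bare domain and any subdomain such as www.
            if host == self.domain or host.endswith("." + self.domain):
                rel = attrs.get("rel", "").lower().split()
                self.links.append((href, "nofollow" in rel))


def classify_page(html, domain):
    """Return "Link Removed", "No Follow", or "Check Manually"."""
    audit = LinkAudit(domain)
    audit.feed(html)
    if not audit.links:
        return "Link Removed"
    if audit.page_nofollow or all(nofollow for _, nofollow in audit.links):
        return "No Follow"
    return "Check Manually"
```

Pass the domain without the www, just as in the manual search, so that both forms of the link are found.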

Step Four: Domains Or Links?

Now for the next step, and this one gets confusing. You’ve collected and checked a list of all the links that Google reported in those downloads, but you’re not done. You still have two things to do:

  1. Download the list of all domains.
  2. Check these domains against your existing list.

When you download all the domains, you’ll notice that you only get a list of base URLs, like website.com. To check these against the list you’ve already made, add a new column in your spreadsheet labeled “Base Domain.”

Open up a new worksheet (this is important) and copy the list of links into it. Select [Data], [Text to Columns], [Delimited] and then make the delimiter a [/]. This will leave you with a list of base domains that you just need to clean up (for example, by using find and replace to strip “www.”) and then paste back into the “Base Domain” column. As long as you don’t sort or delete anything, the list will match up exactly with your list of links.

How to Delimit by / in Excel

Crash Course on Delimiting
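The Text to Columns trick has a short Python equivalent, shown here as a sketch. The function name `base_domain` is my own; it keeps subdomains other than www, which matches what splitting on “/” and then removing “www.” would give you.

```python
from urllib.parse import urlsplit


def base_domain(url):
    """Reduce a link URL to its host, lowercased and without a leading www."""
    if "://" not in url:
        url = "http://" + url  # allow bare entries like website.com/page
    host = (urlsplit(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


# Because this maps each link to one value in order, the results line up
# row for row with the original list of links:
# base_domains = [base_domain(link) for link in links]
```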

Now, go to the very last record in your list of links. Underneath it, paste the list of domains from Google.

Sort by Col A descending, remove duplicates, and you’ll be left with a list of domains that aren’t already represented in your list of links. Check these the same way you did the links in Steps Two and Three.
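The paste, sort, and remove-duplicates routine is really a set difference: domains Google lists minus domains you have already checked. A minimal sketch, with hypothetical function names and made-up sample data:

```python
def normalize(domain):
    """Lowercase, trim, and drop a leading www so both lists compare cleanly."""
    cleaned = domain.strip().lower()
    return cleaned[4:] if cleaned.startswith("www.") else cleaned


def domains_left_to_check(checked_base_domains, google_domains):
    """Return domains from Google's list that are not in your checked list."""
    seen = {normalize(d) for d in checked_base_domains}
    candidates = {normalize(d) for d in google_domains if d.strip()}
    return sorted(candidates - seen)


if __name__ == "__main__":
    checked = ["website.com", "www.blog.example.org"]
    from_google = ["Website.com", "newsite.net", "blog.example.org"]
    print(domains_left_to_check(checked, from_google))
```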

Step Five: Clean Up & Take Stock

When you finish this process, you should have four main lists:

  1. Link Removed: The link was either removed from the site and the site is still around, or the page the link was on was removed, but the domain is still working. If you’ve contacted someone and successfully had them remove a link, you should list it here.
  2. Domain Removed: The domain either doesn’t exist anymore or it has nothing on it.
  3. No Follow: The link or the page the link was on has been nofollowed, or there is a 302 redirect between the link and your website.
  4. Disavow: These are the links you were not able to remove, but don’t want counting against you.
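If you kept your notes as rows of (URL, status, reason), producing these four lists is a simple grouping. The sketch below assumes that record layout and uses the four list names above as status labels; anything marked otherwise (for example “Ok”) is returned separately. The function name is hypothetical.

```python
LIST_NAMES = ("Link Removed", "Domain Removed", "No Follow", "Disavow")


def take_stock(records):
    """Group (url, status, reason) records into the four main lists.

    Returns (lists, other): lists maps each name in LIST_NAMES to its
    (url, reason) pairs, and other holds every record with another status.
    """
    lists = {name: [] for name in LIST_NAMES}
    other = []
    for url, status, reason in records:
        if status in lists:
            lists[status].append((url, reason))
        else:
            other.append((url, status, reason))
    return lists, other
```

Keeping the reason alongside each URL means you will not have to re-check anything if you go through the process again.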

There are many services out there that will help you get links removed, which is what Google says they want you to do. But in most cases, reaching out to webmasters and asking them to remove the link is a fool’s errand. It’s extremely time-consuming and often unsuccessful. Most sites where a webmaster would actually respond to you are ones where you want to keep your link, perhaps asking them only to add a nofollow.

If you know of directory submissions you can remove or paid links you can stop paying for or add nofollows to, you should absolutely do that. But, most webmasters don’t have that option. In addition, most of those services make you pay by the link, so going through this effort first will save you money based on the number of links that need to be checked.

Next time, I’ll show you how to properly format a reconsideration request and a disavow report. There are bound to be more qualified link experts who take a different approach, but this is how a self-proclaimed techie approaches the problem. Best of luck, and leave your ideas and feedback in the comments!

Opinions expressed in the article are those of the guest author and not necessarily Search Engine Land.

About The Author: is the President of an online marketing consulting company offering SEO, PPC, and Web Design services. She's been in search since 2000 and focuses on long term strategies, intuitive user experience and successful customer acquisition. She occasionally offers her personal insights on her blog, JLH Marketing.

  • http://www.brickmarketing.com/ Nick Stamoulis

    I also used Moz’s Open Site Explorer because that tells you in the download file if a link is follow versus nonfollow. You can delete all the nofollow links right from the get go and save yourself an immense amount of time. Work smarter, not harder! Dealing with a link warning is stressful enough.

  • http://www.archology.com/ Jenny Halasz

    I couldn’t agree more, Nick! We have a tool we developed that does this (also checks for meta robots=nofollow), but Moz is a good paid alternative.

  • Jonny Lis

    Hi Jenny,

    I like the article and it’s really well written, however there’s 1 or 2 things I thought I’d comment on. Firstly I think what’s more important in determining whether or not your site needs to have a link clean up (other than when a penalty is in place) is the quality of links themselves rather than having a high number of links from few domains (although this should also be addressed). You could potentially invest in link software like Link Research Tools to give an indication of your overall back-link health as well as checking the links manually.

    I’ve also found that disavowing links and submitting a reconsideration request is unlikely to overturn a links penalty alone, especially if the site has a high number of links (but feel free to prove me wrong if you have had success through that method!). I read elsewhere that Google wants to see you have made a ‘significant effort’ in removing harmful links to your site, so I understand that to mean that you should contact the harmful links, which isn’t necessarily time consuming if you create a template email and use a mass-mail sender.

    If you’re interested, I wrote a blog post on my method for removing links penalties here – http://www.smart-traffic.co.uk/seo-blog/shazzam-how-to-remove-a-google-unnatural-links-penalty-first-time-around.htm. This method might be different to your own but I always like to read about how other people go about removing links/penalties anyway.

  • http://www.archology.com/ Jenny Halasz

    Hi Jonny,

    Thanks for your comment. You are right that Google wants you to make an effort to remove bad links. The truth is, though, usually you don’t have the original email address used to procure a directory submission, or the blog comment link is on a blog that hasn’t posted to in over 2 years. Honestly, the chances are good that if you contact someone to remove a link and they actually respond (unless that link was a paid arrangement and you should just be requesting nofollow), then you probably just removed a legitimate link. Just my opinion.

    Using the techniques described above, I have achieved “manual spam action revoked” for several clients.

    But you’re right that I should have mentioned it’s quality over quantity. So while the number of referring domains/links ratio is useful, if you know you’ve been spamming, grab a chair and start cleaning up!

  • http://www.archology.com/ Jenny Halasz

    only using the disavow tool and not removing any of the bad links will not work; this is true. However, there are many steps you can take to clean up as you go, such as “If you’ve contacted someone and successfully had them remove a link, you should list it here.” and “If you know of directory submissions you can remove or paid links you can stop paying for or add nofollows to, you should absolutely do that.” In many cases, we’ve also identified uniquely coded links (such as affiliates) that we can run through a 302. This post is simply meant to help diagnose the problem and go through the manual checking stage. Next month there will be more. :)

  • tim789

    Jenny – Thank you for a very thorough, yet simple guide. Our Google WT shows 17.5k links to one page from 27 source domains. Ouch. problem is, I can’t figure out where they are coming from. Clicking the linked page shows a list of domains, but the most links from any one domain is 75. The total is nowhere near 17.5k. A Google search on that one domain shows 856 links, still far short of the total in Google WT. It’s very frustrating. I’m not sure how to find the actual sources. Any thoughts on this are greatly appreciated!

  • Pete McAllister

    Great article Jenny, link cleanups are hot just now and this guide cuts through the fluff! Absolutely love actionable content like this :).

    I use http://www.urlopener.com to get through browser checking multiple urls a lot more quickly. Just thought I’d share that as it may help out some other readers…

  • http://adsmark.in/ Amrit James

    it will help alot :) thanks

  • Agência MACAN

    Hello Jenny Halasz!

    Firstly congratulations on the post, is very practical.

    I wonder if there is any way to speed this removal of links. We have managed to remove the links of some areas that had a large amount and also insert the file disavow.

    Do you have an estimate of time for them to disappear GWT? We send the files over 15 days and a small amount of them were removed.

    Thank you!

  • http://www.archology.com/ Jenny Halasz

    You should get a response within 24 hours that Google has received your reconsideration request. Then it takes a couple of weeks to a month or more to get an actual response. Sorry, no way to speed the process up. Although it helps to have everything formatted properly, which I’ll discuss in my next post.

  • http://www.archology.com/ Jenny Halasz

    That’s a really neat tool that I didn’t even know existed. Bookmarked – and thanks!

  • http://www.archology.com/ Jenny Halasz

    You’ve hit on the part of GWT that drives webmasters crazy. The numbers don’t match. Don’t focus on the number too much; look for trends and patterns instead. For that domain that shows 856 links, it’s probably a run of site link (look for an ad, a blogroll, a link in the footer, etc) that once removed will remove hundreds of links from your overall profile. That’s a time when it’s absolutely worth it to contact the site owner and ask them to remove your link or add a nofollow to it. Good luck!

  • http://www.archology.com/ Jenny Halasz

    Hi Paul,

    All good notes, thanks. I always try to write my posts from the perspective of the small business owner who usually can’t afford tools (although I personally think certain tools are always worth the investment), but you are absolutely right that screaming frog does all this and more.

    I differ with you on including other data sources. Google says in their disavow how-to (linked in the post above) to use the list from webmaster tools. If you add more data sources, you’re just making more work for yourself, in my opinion.

  • tim789

    Thank you, Jenny. This is very helpful!

  • http://www.archology.com/ Jenny Halasz

    Hi Steven,

    I read your article, thank you for sharing. It seems that the only item in my list that you specifically disagree with is my assertion that trying to contact spammers to have links removed is a fool’s errand. I mentioned in response to one of the other posters – of course you should try to have any links you can removed – or if you can redirect them to another page via a 302 and strip the pagerank off, that’s always good too. We often do that with clients that have affiliate links.

    In terms of suggesting to people that they look at the source code of each link to determine where it is on the page, etc., I’m not clear on your objection. And finally, I take offense to your statement “(trying to pass off as an authority on the subject) who don’t really have the experience or the expertise in the field.” I’ve been in the industry since 1999, and at the risk of sounding cocky, I *am* an authority on the subject.

    I write all of my posts from the perspective of doing as much as you can for free… of course people can hire a consultant or buy a tool. But in this economy, I find that my readers respond better to sweat equity opportunities. If you’d like to discuss better ways to do link auditing without trying to sell your tool, then we’re all ears.

    Thanks for reading,

    Jenny

  • Naveen Kadian

    Hello Jenny,

    Thanks for this great article. First of all i didn’t get any message from Google regarding “Unnatural Links Warning” but my ranking is very effected in search engine after that i have checked my WMT and checked my all links and made a disavow file and submit it to WMT without removing my bad links. But didn’t get any benefit till now. Now after read your article i have come to know that i have to remove my bad links first, then i can make a disavow file for get back my ranking again. I am confused that which links is good and which one is bad for my website. I have some directory, Social bookmarking, articles and forums links which are links to my website. I have one problem that directory websites don’t have any email ID for mail but they have link remove request form. I have requested for remove my links on all directories and also removed my bad articles links. But i didn’t get any confirmation mail from those directories. Now i need your help, can i make a reconsideration request to Google or have to do anything more to do this??

    Waiting for positive reply asap.

    Thanks
    Naveen Kadian

  • Manish

    Hi Jenny

    Thanks for the tips, I did exactly what you intended to achieve, but I created a simple code in PHP to do so, I have multiple aged site and had many many links.

    One day when I can clean the code I will may upload it for in OS for others, It will let you add the links in bulk, check duplicate, put related links under the domain, then let you check all the links under each domain, add comments to each link or domain, let you select multiple options like , Good, Disavow,emailed,remove and done. can let you send email out on 1 click, also create the report for disavow. Well … I have multiple sites and keeps my work organised.

    Question I have: I was able to get my website penalty revoke after I submitted almost everything, in the process I have found many Website Analysis tools, Many web directories (paid and unpaid) and many Press Releases .. should these be disavow too, or not.

    I did found heaps of articles, which were removed, anything to do with link exchange was a no no, a lot of article sites shared our articles to other which cannot be removed and tons of Comments and Book Marks. All Removed as much as humanly possible.

    Question : It is almost 2 weeks and I still cannot see my website any where .. how long should I wait to see my website back even on 10th Page

    Also Should I ever clean my disavow list once the penalty has been revoked ?

 
