Google has rolled out new tools to help people quickly get content removed from its search engine. Those targeted at site owners allow for speedy removal of pages and cached copies of pages. Other tools allow people to request the removal of images or links to pages with personal information about themselves, in the right circumstances. More on the tools and the various options is covered below.
Site Owner Removal Options
For site owners, the best way to keep content out of Google is by using the robots.txt or meta robots tag options. Either option can prevent pages from getting into Google or get them removed once included. However, getting pages removed once in can take time. You have to wait for Google to revisit the pages you’ve flagged for removal, a process that can take days or longer.
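To make the robots.txt option concrete, here’s a minimal sketch in Python of checking whether a URL is blocked from crawling. It uses the standard library’s urllib.robotparser. The robots.txt rules, the example.com URLs and the is_blocked helper are all made up for illustration, and this isn’t a claim about how Google itself evaluates the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents; the paths are invented for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /old-page.html
"""


def is_blocked(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the robots.txt rules disallow user_agent from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "http://www.example.com/private/report.html",
        "http://www.example.com/about/",
    ):
        print(url, "blocked" if is_blocked(ROBOTS_TXT, url) else "crawlable")
```

A check like this only tells you what your own file says. Google still has to revisit the site before the block takes effect in the listings.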
The new site owner tools can be found within Google Webmaster Central, for those with verified accounts (That’s explained more here, and it’s free and easy to do). Once logged in, select the site you want to remove pages from via the "Dashboard" screen. When that site loads, choose the "Diagnostics" tab, then select the "URL Removals" link you’ll see in the left-hand navigation.
That will load a screen up with four options, allowing you to remove:
- Individual URLs: a particular page, image or anything with a specific URL that’s listed in Google
- Directories: all pages within particular sections of your site, such as within the /about/ area
- Entire Site: want to wipe out your entire site? Go ahead!
- Cached Copy: want a page to be listed, but not have a copy of it cached anymore?
To remove individual URLs, directories or your entire site with the new tools, you must block crawling of these using either the robots.txt or meta robots tag options. Alternatively, if the page, pages or entire site are physically gone from the internet (returning 404 "not found" or 410 "gone" error codes), then the tools can also process the request.
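The rule above boils down to a simple either/or test. Here’s a rough sketch of it in Python. The function name and its parameters are my own, and the real tool may well check more than this.

```python
# Status codes the article says count as "physically gone".
GONE_STATUS_CODES = (404, 410)


def removal_eligible(
    status_code: int,
    blocked_by_robots_txt: bool,
    blocked_by_meta_robots: bool,
) -> bool:
    """Return True if a URL meets the stated conditions for a removal request.

    A request qualifies when the page is gone (404 or 410) or when crawling
    is blocked through robots.txt or the meta robots tag.
    """
    is_gone = status_code in GONE_STATUS_CODES
    return is_gone or blocked_by_robots_txt or blocked_by_meta_robots
```

So a live page returning 200 with no block in place would fail the test, which is why a request for it should come back "Denied."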
To remove a URL, you enter that URL. Up to 100 can be entered at a time using the form (if you want to do more than this, submit the first 100, then start again with a fresh form). To delete directories or entire web sites, you enter the directory path or the web site address using separate forms.
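If you have a long list of URLs to clear out, it helps to split it into form-sized groups first. A minimal sketch, assuming you keep the URLs in a Python list (the batches helper and the example.com URLs are hypothetical, and nothing here submits anything to Google):

```python
from typing import Iterator, List

MAX_URLS_PER_FORM = 100  # the per-form limit described above


def batches(urls: List[str], size: int = MAX_URLS_PER_FORM) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `size` URLs, in their original order."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


if __name__ == "__main__":
    # Made-up URLs for illustration only.
    to_remove = ["http://www.example.com/old/page-%d.html" % n for n in range(250)]
    for number, group in enumerate(batches(to_remove), start=1):
        print("Form %d: %d URLs" % (number, len(group)))
```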
After submitting a request, the deletion will go into a processing queue. You can monitor the status of any request using the "Current Requests" tab of the URL Removals screen. Requests in progress are flagged as "Pending." Those removed get flagged "Removed" and appear on the "Removed Content" tab. If there’s a problem, a "Denied" message appears, with a link to explain more about what problem needs to be corrected.
How long to process a request? The tool should act on any valid requests within 3 to 5 days or faster.
How long will removals last? For six months, once processed — and regardless of whatever you do on your web site during that time, unless you specifically ask for reinclusion.
For example, say you remove a page from your web site, then ask for the page to be removed from Google using the removal tool. Two weeks later, you put the page back up. Google will still continue to follow the original instructions, not to include the page, even though it exists.
During the six month period, you can rescind a removal request. Simply find any removal action you’ve done listed on the Removed Content tab, then select the "Reinclude" option that should show.
After the six month period, Google will resume including or excluding content as normal; that is, it will look to see if you have a robots.txt or meta robots tag barrier in place to keep valid pages from getting in. If you want pages kept permanently out, don’t put them back online without the proper restrictions in place!
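To keep track of when a removal lapses, you could note the date it was processed and work out the end of the window yourself. The sketch below assumes "six months" means six calendar months, which is my assumption and not something Google has spelled out. The function names are also my own.

```python
import calendar
from datetime import date

REMOVAL_MONTHS = 6  # assumed to mean calendar months


def add_months(start: date, months: int) -> date:
    """Return the date `months` calendar months after `start`.

    If the target month is shorter, the day is clamped to its last day.
    """
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def removal_in_effect(processed_on: date, today: date, reincluded: bool = False) -> bool:
    """Return True while a processed removal still applies.

    The removal ends early if reinclusion was requested, otherwise it lasts
    until the assumed six-month mark.
    """
    if reincluded:
        return False
    return processed_on <= today < add_months(processed_on, REMOVAL_MONTHS)
```

Under that assumption, a removal processed on April 18 would stop applying on October 18, at which point whatever robots.txt or meta robots rules you have in place take over again.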
Removing Cached Pages
By default, Google listings have a link to the actual web page as well as a cached copy of the page. Cached pages are where Google will show a searcher a copy of the page that Google saw without the searcher having to go to the actual web site. This is handy for searchers in cases where a page might no longer exist. However, site owners might not want these cached copies to exist at Google.
The meta robots tag provides options to keep cached pages out, but the new tools give you speedier access for removal. As with removing URLs, the tools at Webmaster Central will get rid of a cached copy within 3 to 5 days.
To process your request, Google needs to see that a meta robots tag set to "noarchive" is now on the page (see Meta Robots Tag 101: Blocking Spiders, Cached Pages & More for more about this). Put that tag on the page, push submit, and you’re set. Well, you will be set from around 3pm Pacific time from April 18 onward. There’s a bug still being worked out for this part of the new toolset.
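Before pushing submit, it’s worth confirming the tag really is on the page. Here’s a minimal sketch that pulls the meta robots directives out of a page’s HTML using Python’s standard html.parser module. The class and function names are my own, the sample HTML is made up, and this only approximates whatever check Google runs on its end.

```python
from html.parser import HTMLParser
from typing import List, Optional, Set, Tuple


class RobotsMetaParser(HTMLParser):
    """Collect the directives from every <meta name="robots"> tag on a page."""

    def __init__(self) -> None:
        super().__init__()
        self.directives: Set[str] = set()

    def handle_starttag(self, tag: str, attrs: List[Tuple[str, Optional[str]]]) -> None:
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").strip().lower() != "robots":
            return
        # The content attribute is a comma-separated list such as "noarchive, nofollow".
        for part in attr_map.get("content", "").split(","):
            directive = part.strip().lower()
            if directive:
                self.directives.add(directive)


def robots_directives(page_html: str) -> Set[str]:
    """Return the set of meta robots directives found in page_html."""
    parser = RobotsMetaParser()
    parser.feed(page_html)
    parser.close()
    return parser.directives


def has_noarchive(page_html: str) -> bool:
    """Return True if the page carries a meta robots "noarchive" directive."""
    return "noarchive" in robots_directives(page_html)
```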
What if you can’t put the tag on a page? I’ll explain more how this works in the third-party section below.
The cached page will be kept out for six months. You can ask for the cached copy to be reincluded sooner than this, if you want. However, make sure Google has actually revisited the page since you altered it. Unfortunately, this means watching your logs. To be safe, you’re probably better off not asking for the reinclusion before the six months have expired.
Want to keep the page or any pages from being cached permanently? Again, use the meta robots tag.
Finally, keep in mind that using the removed cached pages option will also remove any description of the page in the listings. In contrast, the meta robots tag gives you the ability to remove just the cached page OR the description OR both, if you choose.
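Here’s a small sketch of that finer control. It builds the meta robots tag for each combination. "noarchive" is the value named above for dropping the cached copy. "nosnippet" isn’t named in this article, but it’s the directive Google documents for dropping the description. The helper function itself is hypothetical.

```python
def meta_robots_tag(hide_cache: bool, hide_description: bool) -> str:
    """Build a meta robots tag for the chosen combination of restrictions.

    Returns an empty string when neither restriction is wanted, since no
    tag is needed in that case.
    """
    directives = []
    if hide_cache:
        directives.append("noarchive")  # no cached copy
    if hide_description:
        directives.append("nosnippet")  # no description in the listing
    if not directives:
        return ""
    return '<meta name="robots" content="{}">'.format(", ".join(directives))


if __name__ == "__main__":
    print(meta_robots_tag(hide_cache=True, hide_description=False))
    print(meta_robots_tag(hide_cache=False, hide_description=True))
    print(meta_robots_tag(hide_cache=True, hide_description=True))
```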
URL Removals Options: At-A-Glance
I’ve written earlier about a similar Yahoo tool for removing URLs (Up Close With Yahoo’s New Delete URL Feature) plus options with all the major search engines to remove page descriptions and cached copies (Meta Robots Tag 101: Blocking Spiders, Cached Pages & More). Below is an at-a-glance chart I’ve used with both those previous articles, now updated to add in the Google options.
| |Robots.txt|Meta Robots Tag|Yahoo Delete URL|Google Delete URL|
|Stops Index Inclusion|Yes|Yes|Yes|Yes|
|Stops Link Only Listing|No|No| | |
|Why Use?|Easy to block many pages at once|Can’t access root domain|Don’t even want URL to appear or need page out fast|Don’t even want URL to appear or need page out fast|
- Stops Crawling: If "Yes," the page won’t be spidered at all. If "No," the page might get spidered, but it will not be included in listings.
- Stops Index Inclusion: URLs will not show up in response to searches.
- Stops Link Only Listings: This is where a page is listed with only a title and URL. Yahoo calls these "thin" listings; Google calls them "partially indexed".
Third Party Removal Options
What can you remove? Not a lot using the new tool, if you haven’t worked with the site owner themselves.
Third Party Page Removal
Let’s say there’s some page (or image) you don’t like on a web site. You’ve contacted the site owner, and they’ve agreed to pull down the content. Unfortunately, you still see it showing up in Google’s listings. Ideally, the site owner could log into Google Webmaster Central, use the site owner tools I’ve covered, and get the page removed. But they don’t want to do this.
The third party tool lets you do it for them, or for any page that’s no longer live on the web or now banned from crawling using robots.txt or the meta robots tag. You simply enter the URL of the page in question and submit. If it’s a valid request (again, the page is no longer live or being blocked from crawling), it will be removed in 3 to 5 days. You can also log in to see the status of your request.
Site owners: don’t freak out over this! Someone can’t remove your pages from Google unless you actually take them off the web or block them from being crawled. This simply speeds up the removal process.
The ability for a third party to trigger a change isn’t new, either. Google’s long had an automatic URL removal tool that anyone could use to trigger page removals. In fact, when WebmasterWorld blocked spiders from hitting the site back in November 2005, several people used that tool to get pages removed faster than Google would have done following its usual schedule.
That tool remains for the moment, but Google says to use the new tools for faster processing and better reporting.
Third Party Cache Removal
What if the page remains but just part of it has changed — and you want Google’s cached copy to reflect this? There’s an option for that, too.
For example, say you’re Joe of Joe’s Diner. Someone reviewed your fine eatery and wrote a three-word review: "Joe’s Diner Sucks." This review upsets you. You contact the site owner, and they agree to remove it from the page. Unfortunately, it can still be seen by anyone who looks at Google’s cached copy. You have to wait until whenever Google gets around to refreshing its copy of the page for that review to go away (which could take a while; see Squeezing The Search Loaf: Finding Search Engine Freshness & Crawl Dates for more on that).
As explained, the site owner could help by getting the cached page removed entirely. But if the site owner doesn’t want to do that, you can use the third-party tool to make it happen.
First, check to see if the site owner has at least put on the required meta robots tag to prevent caching. If so, submit the page, and the request will be processed.
No tag? Here’s the alternative. Submit the URL, then find some of the words that have been removed (such as "diner sucks"). Enter these words into the "Term(s) that have been removed from the page" box of the Cached page removal form. Submit, and Google checks the page. If it sees the words are gone, it knows the page has changed and processes the request to remove the cached copy.
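Google hasn’t said exactly how it runs this check, but the idea can be sketched as "the submitted terms no longer appear in the live page’s text." Below is a rough Python illustration using the standard html.parser module. The function names and sample pages are made up, and the matching rule (case-insensitive, whitespace-normalized substring) is my guess.

```python
from html.parser import HTMLParser
from typing import Iterable, List


class _TextExtractor(HTMLParser):
    """Collect the text content of a page, ignoring the tags themselves."""

    def __init__(self) -> None:
        super().__init__()
        self.chunks: List[str] = []

    def handle_data(self, data: str) -> None:
        self.chunks.append(data)


def page_text(page_html: str) -> str:
    """Return the page's text, lowercased and with whitespace collapsed."""
    extractor = _TextExtractor()
    extractor.feed(page_html)
    extractor.close()
    return " ".join(" ".join(extractor.chunks).split()).lower()


def terms_removed(page_html: str, terms: Iterable[str]) -> bool:
    """Return True if none of the given terms still appear in the live page."""
    text = page_text(page_html)
    return all(" ".join(term.split()).lower() not in text for term in terms)


if __name__ == "__main__":
    # Made-up "before" and "after" versions of the review page.
    before = "<html><body><p>Joe's Diner Sucks.</p></body></html>"
    after = "<html><body><p>No reviews yet.</p></body></html>"
    print(terms_removed(before, ["diner sucks"]))
    print(terms_removed(after, ["diner sucks"]))
```

Notice what a check like this can’t tell: who made the change, or whether the site owner wants the cached copy gone at all.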
Site owners — DO feel free to freak out over this! You should.
To be clear, anyone can wipe out your cached pages in Google for up to six months using this third party tool even if you have NOT yourself used the required meta tag.
No big deal? Who cares about cached pages being gone? Remember, it’s not just that your cached page will go. The description for your listing will disappear, too.
Frankly, I don’t think Google should have launched this feature this way. I think it is ripe for abuse.
For its part, Google says the old tool actually operated the same way for years and has never been abused. In other words, the feature isn’t new; it’s just getting new attention as part of the new toolsets. Google says it will watch more closely to prevent abuse such as I’ve outlined. The company’s official statement:
Google’s always encouraged consumers to work directly with a site’s webmaster when they have concerns about content in our search results. When the webmaster has removed or changed information on the live page, but that information still exists in our cached copy, we’ve worked with consumers to review and help expedite the removal of outdated cached copies appropriately. Where consumers previously reported these outdated cached copies via online contact forms, they can now do so via the tool. The same precautions and considerations are still observed; with the launch of the new tool, the means by which a consumer can report an outdated cached copy has changed. We’ll monitor requests through the new tool and make adjustments as necessary.
Personal Info Removal
What if the site owner won’t remove or modify a page? Then you can get the page removed if it shows or contains:
- Your social security or government ID number
- Your bank account or credit card number
- An image of your signature
- Explicit content which violates Google’s guidelines and contains your personal information
What’s that last one all about? Well, say someone scrapes your name or business name and shoves it onto a page of porn content. The porn’s the "explicit" part and your personal information — well, that’s your name, Google says. Google will act to get rid of that page, and no one will be the sadder for it.
That third party removals tool also provides options for anyone to delete dead links in Google or report pages or images that have slipped past the SafeSearch adult-content filter.
In addition to that, this page at Google lists other types of content removals you can request, such as listings in Google Blog Search or transcoded pages in Google Mobile. Google’s DMCA page also covers how to remove content that might be violating your copyright.