Wayback Machine Adds 160 Billion Indexed Pages In A Year, Surpasses 400 Billion Indexed Pages

The Internet Archive announced that the Wayback Machine, a huge internet archive of web pages dating back to 1996, has surpassed 400 billion pages indexed.

In January 2013, a little over a year ago, the Wayback Machine reported 240 billion indexed URLs. Since then it has added another 160 billion, bringing the total to over 400 billion indexed URLs.

On Friday, the Internet Archive announced the milestone on its blog and said the indexed pages range from late 1996 up to a few hours ago. It also shared some of its history:

  • 2001 – The Wayback Machine is launched. Woo hoo.
  • 2006 – Archive-It is launched, allowing libraries that subscribe to the service to create curated collections of valuable web content.
  • March 25, 2009 – The Internet Archive and Sun Microsystems launch a new datacenter that stores the whole web archive and serves the Wayback Machine. This 3 Petabyte data center handled 500 requests per second from its home in a shipping container.
  • June 15th, 2011 – The HTTP Archive becomes part of the Internet Archive, adding data about the performance of websites to our collection of web site content.
  • May 28, 2012 – The Wayback Machine is available in China again, after being blocked for a few years without notice.
  • October 26, 2012 – Internet Archive makes 80 terabytes of archived web crawl data from 2011 available for researchers, to explore how others might be able to interact with or learn from this content.
  • October 2013 – New features for the Wayback Machine are launched, including the ability to see newly crawled content an hour after we get it, a “Save Page” feature so that anyone can archive a page on demand, and an effort to fix broken links on the web starting with WordPress.com and Wikipedia.org.
  • Also in October 2013 – The Wayback Machine provides access to important Federal Government sites that go dark during the Federal Government Shutdown.

About The Author: Barry is Search Engine Land's News Editor and owns RustyBrick, a NY-based web consulting firm. He also runs Search Engine Roundtable, a popular search blog on very advanced SEM topics. Barry's personal blog is named Cartoon Barry.
