Learning SEO From Building A Web Crawler

There is no doubt that you can learn a tremendous amount about search engine optimization (SEO) by reading sites like this one or those in our blogroll, but there is always more to learn from getting your hands dirty. You can do that by experimenting with SEO techniques on real sites, and you can also learn an incredible amount by building your own web crawler as a way to reverse engineer how search engine crawlers work.

In fact, Google Webmaster Analyst JohnMu tweeted exactly that this morning: “Want to learn about indexing/crawling? Don’t read – code a spider.”
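For readers who want to take that advice literally, here is a minimal sketch of what "coding a spider" can look like. It is not how Google or SEOmoz crawl. It is a toy breadth-first crawler built only on Python's standard library, and the names `LinkExtractor`, `crawl`, `fetch` and `max_pages` are made up for illustration. The `fetch` argument is an assumed callable that takes a URL and returns the page's HTML (or `None` on failure), so the crawl logic can be tried without touching the network.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed, fetch, max_pages=50):
    """Breadth-first crawl starting at seed, staying on the seed's host.

    fetch is a hypothetical callable: URL in, HTML string out (None on failure).
    Returns a dict mapping each crawled URL to the same-host links found on it.
    """
    host = urlparse(seed).netloc
    queue = deque([seed])   # URLs waiting to be fetched (the "frontier")
    seen = {seed}           # URLs already queued, so nothing is fetched twice
    graph = {}
    while queue and len(graph) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue  # page could not be fetched; skip it
        parser = LinkExtractor()
        parser.feed(html)
        outlinks = []
        for href in parser.links:
            # Resolve relative links and drop #fragments so duplicates collapse.
            absolute, _fragment = urldefrag(urljoin(url, href))
            if urlparse(absolute).netloc != host:
                continue  # ignore links that leave the seed's host
            outlinks.append(absolute)
            if absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
        graph[url] = outlinks
    return graph
```

To try it offline, pass a dictionary's `get` method as `fetch`, with URLs as keys and HTML strings as values. Even this toy version surfaces the questions a real crawler has to answer: which URLs count as duplicates, which links to follow, and when to stop. A crawler pointed at the live web would also need to respect robots.txt, rate-limit its requests and handle errors, all of which this sketch leaves out.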

That is exactly what SEOmoz did: it built a crawler and an index of web pages to learn more about the web and to share that data with the industry. Linkscape was introduced in October 2008 and has grown to 44 billion web pages and 474 billion links.

Rand Fishkin of SEOmoz has posted the lessons learned from building an index of the web. So in this case, maybe reading about someone else’s experiences and findings in building such a crawler can help you too.


About the author

Barry Schwartz
Staff
Barry Schwartz is a Contributing Editor to Search Engine Land and a member of the programming team for SMX events. He owns RustyBrick, a NY-based web consulting firm. He also runs Search Engine Roundtable, a popular search blog on very advanced SEM topics. Barry can be followed on Twitter here.
