Looking to do research based on data gathered from across the web? That’s one of the purposes of Common Crawl, and the group has just released new data and launched a contest to encourage use of that data.
Common Crawl is also running its first-ever Common Crawl Code Contest, challenging developers to do something innovative using the data relating to job trends or social impact analysis. Three winners will each get $1,000 in cash, an O’Reilly Data Science Starter Kit, one year of GitHub’s Small Plan and more. Submissions are accepted through August 29.
FYI, I’m on the advisory board for the non-profit group. There’s no compensation for that involvement. Other advisers and I just offer free advice to the group.
Common Crawl’s data from 2011 was recently used by Zyxt Labs to show how much Facebook has spread across the open web. See our Marketing Land article for more on that: