If Twitter were like an old-style McDonald’s, the sign outside would have changed to “1 Billion Queries Served” today. And to keep serving up those queries to humans and machines alike, Twitter is now using a new underlying search technology.
The news was announced in a post on Twitter’s Engineering blog today. Let’s take the technical change first, then get into the fun numbers.
New Search Engine “Under The Hood”
Twitter is now using a new search engine, and in fact has been using it for the past few weeks, the company said. But the new search engine isn’t a new “look-and-feel.” Instead, this is about what’s under the hood.
It’s a change to the actual “engine” that lets you search, the software that takes your query, hunts through billions of tweets and brings back answers, often in less than a second.
Twitter says its new technology will last it for years and has enough horsepower to handle 50 times the amount of data that currently flows in via tweets. For the technically inclined, read Twitter’s post for more details about all that.
One of the biggest changes is shifting to the use of an “inverted index.” For those who’d like to learn more about inverted indexes and modern search engines, here are some resources I recommend:
- How does Google collect and rank results?, from Google. Written originally for librarians, it’s a nice “plain English” overview.
- On Search, the Series: Tim Bray cofounded Open Text, one of the earliest (and, briefly, one of the most popular) web search engines. He now works at Google. In this series, he goes through the basics of building a search engine.
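For a quick feel of what an inverted index does before diving into those resources, here is a minimal sketch in Python. It is not Twitter's implementation. The sample tweets, the `build_index` and `search` names, and the simple whitespace tokenization are all assumptions made for illustration. The idea is that instead of scanning every tweet for a word, you keep a map from each word to the tweets that contain it.

```python
from collections import defaultdict

def build_index(tweets):
    """Map each lowercase term to the set of tweet ids containing it."""
    index = defaultdict(set)
    for tweet_id, text in tweets.items():
        for term in text.lower().split():
            index[term].add(tweet_id)
    return index

def search(index, query):
    """Return ids of tweets containing every term in the query (AND search)."""
    terms = query.lower().split()
    if not terms:
        return set()
    # Start with the matches for the first term, then narrow down.
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result

# Hypothetical sample data, not real tweets.
tweets = {
    1: "new search engine at twitter",
    2: "twitter search goes back seven days",
    3: "flying home from the conference",
}
index = build_index(tweets)
print(sorted(search(index, "twitter search")))  # [1, 2]
```

A real engine adds much more (ranking, smarter tokenization, handling new tweets as they arrive), but the word-to-documents lookup is the core of it.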
You Can Go Back Further In Time
One of the shortcomings with Twitter’s own search engine is that over time, it’s gotten harder to search back past a few days. Earlier this year, I wrote in Where Have All The Old Tweets Gone? how it was impossible to find, by mid-January, any tweets from New Year’s Day in Twitter. There wasn’t the capacity to store older ones for searching, Twitter explained to me.
By mid-September, I was finding that Twitter Search wasn’t allowing you to find tweets more than four days old. In today’s post, Twitter says the new system allows you to search “twice as long.” I’m not sure if that’s a time reference or not, but I did find that I can now search back as far as seven days.
Seeking Tweets Older Than A Week? Seek Elsewhere
Want to search back further than seven days? Twitter Search isn’t your solution, nor does the company have any plans to significantly increase that period. Twitter told me in June (and reconfirmed this in August) that since other companies like Google and Topsy are focusing on deep archive search, Twitter aims to improve search on its own site in other ways.
You can read more about that in my story from August, Topsy: Now Searching Tweets Back To May 2008. That also includes a chart showing how far back Twitter, Topsy and Google go, along with some advice for finding “historic” tweets. Tweets older than a week are historic in Twitteryears, aren’t they?
The New Twitter Search, Look & Feel
Twitter has already delivered on its promise to improve the search experience in ways other than archive searching. It has done so through cool features that are part of the new Twitter look-and-feel, which has been rolling out since mid-September.
If you don’t have the new Twitter look yet — or even if you do but haven’t explored it fully — our The New Twitter & Search, An Illustrated Guide takes you through the new functionality.
The changes have been a real hit with me. Twitter offers search in two ways, at the dedicated Twitter Search site and through a search box at Twitter.com. I used to prefer Twitter Search. But I’ve changed as I’ve been using search at Twitter itself, with the new look. The new features make it much easier to get further context about the tweets you find in response to a search. I hardly go back to Twitter Search itself, these days.
Let’s Do The Numbers!
The post also revealed some updated figures:
- There are over 1,000 tweets per second
- There are 12,000 queries per second
- There are over 1 billion queries per day (1,036,800,000 per day)
Taking the math further:
- There are 31 billion queries per month
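If you want to check the arithmetic yourself, here is a small sketch. The 12,000 queries per second figure is from Twitter's post. The 30-day month is my assumption for the monthly figure, and the variable names are just for illustration.

```python
QUERIES_PER_SECOND = 12_000      # figure from Twitter's post
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400
DAYS_PER_MONTH = 30              # assumption: a 30-day month

queries_per_day = QUERIES_PER_SECOND * SECONDS_PER_DAY
queries_per_month = queries_per_day * DAYS_PER_MONTH

print(f"{queries_per_day:,} per day")                      # 1,036,800,000 per day
print(f"{queries_per_month / 1e9:.1f} billion per month")  # 31.1 billion per month
```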
Queries Versus Searches
Notice that Twitter talks about “queries per day.” I think they’ve used the word “queries” rather than “searches” deliberately, because unlike regular search engines such as Google, Yahoo and Bing, Twitter serves out a lot of “searches” that aren’t done by humans. Saying “queries” is a good way to make this distinction.
People do search for tweets at Twitter, of course. But many “searches” are also standing queries that are issued by Twitter clients. For example, any Twitter tool that brings back all the replies to someone’s account automatically is technically doing a “search” for the matches.
I like it. Let a query be any request, from human or machine. Let a search be a human-generated query. Now if we can get Twitter to tell us the number of searches that happen out of the overall query pool, we could better compare it to the major search engines.
To understand more about the automated queries that Twitter handles, I highly recommend reading my post from earlier this year, Twitter Does 19 Billion Searches Per Month, Beating Yahoo & Bing (Sort Of). It goes into more depth about all this.
However the queries are issued, by human or machine, they’re on the rise. Here are a few benchmarks:
- April 14, 2010: 19 billion queries per month
- July 6, 2010: 24 billion queries per month
- October 6, 2010 (today): 31 billion queries per month
That’s 63% growth in about six months! If I’m doing the math right, of course. Math is hard, especially on tape (that’s for you, Real Genius fans — including Chris Sacca, a Twitter investor. Work that into a trivia quiz!).
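For anyone who wants to double-check that growth figure, here is a short sketch using the three benchmarks listed above. The `percent_growth` helper is a name I made up for illustration.

```python
def percent_growth(old, new):
    """Percentage change from old to new."""
    return (new - old) / old * 100

# Billions of queries per month, as listed in the benchmarks above.
benchmarks = [
    ("April 14, 2010", 19),
    ("July 6, 2010", 24),
    ("October 6, 2010", 31),
]

# Growth between each consecutive pair of benchmarks.
for (d0, q0), (d1, q1) in zip(benchmarks, benchmarks[1:]):
    print(f"{d0} -> {d1}: {percent_growth(q0, q1):.0f}%")

# Growth from the first benchmark to the last.
overall = percent_growth(benchmarks[0][1], benchmarks[-1][1])
print(f"Overall: {overall:.0f}%")  # Overall: 63%
```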
Queries Versus Google Searches
When Twitter first released its 19 billion queries per month figure earlier this year, I pulled comScore figures from December 2009 for worldwide search queries, to get a rough sense of how it measured up against other search engines. It came in second, but well behind Google.
There haven’t been updated worldwide search figures that I’ve seen recently, so I still only have December 2009 figures to compare to. That probably undercounts the other services. But still, it’s an interesting view. Twitter stays in second but now handles about 1/3 of Google’s human volume:
- Google: 88 billion per month
- Twitter: 31 billion per month
- Yahoo: 9.4 billion per month
- Bing: 4.1 billion per month
DANGER, WILL ROBINSON! OK, just remember. For Google, Yahoo and Bing, you’re seeing older figures and more important, figures for human-driven searches. Google has many APIs (a way for a machine or software program to get answers automatically) that handle “requests” just as Twitter does. If Google published an overall figure on that, I suspect it would dwarf what Twitter does.
That’s not to take away from Twitter’s improvements and growth. It’s just that Google has a huge ecosystem including advertisers and publishers that all pull data from it (all those AdSense ads you see? Each and every one issues a request when you load a page).
Tweets Versus Facebook Updates
Finally, Twitter said it handles 1,000 tweets per second now. That’s up 67% since it reported 600 tweets per second (or TPS, in Twitter’s own lingo), back in February.
When those figures came out, we compared the number of tweets to Facebook status updates. There are a bunch of caveats in comparing all this stuff, so see our earlier article about that: By The Numbers: Twitter Vs. Facebook Vs. Google Buzz.
Having warned that it can be hard to compare, how do things compare? I don’t know.
Previously, Facebook had reported 700 status updates per second, when we looked at its Facebook Statistics page. That stat is no longer reported. I can’t tell if there are now more tweets at Twitter than status updates at Facebook or not (I suspect not; Facebook has seen plenty of its own growth).
If I can get better stats from Facebook, I’ll update this story later.
By the way, apologies if there are a few typos I haven’t caught. I’m writing all this on a plane (thanks, United) using wifi (awesome, Gogo Inflight) returning from our Search Marketing Expo East conference (it was great, biggest ever, thanks for asking or at least I thought you did) pretty short of sleep. That also explains any perceived punchiness in the article. But it’s all accurate. I’m pretty sure.
Other people are writing about this too. Need I say it? For a great collection of related content, see Techmeme.