Google Realtime Search & The Aftermath Of The Google-Twitter Split

Last Friday, Twitter quietly shut down its “firehose” of tweet data that was being piped to Google. Like a gas station no longer getting deliveries, Google in turn effectively had to hang a “Closed” sign on its Google Realtime Search service. What happened, and what’s next for those who depended on Google to get some of their Twitter gas? Some thoughts and advice, below.

Topsy Provides Twitter Archive Search

First, there’s a great alternative to Google Realtime Search: Topsy.

Indeed, the company has just put out a blog post reminding the world that it’s the only Twitter archive search service left standing.

Fair enough. That’s totally correct. The company expanded its coverage back in August 2010, and my review below explains more about some of the features it offers:

In fact, Topsy allowed you to go farther back than Google did. Google had promised that it would extend its index earlier than February 2010, but I don’t think that really happened.

Topsy tells me its index still goes back to May 2008, as I originally reported.

Bing Has The Firehose, But No Real Archive

Unlike Google, Topsy still has access to that “firehose” data from Twitter (and won’t reveal any more details than that). That’s why it’s still ticking along.

Microsoft’s Bing search engine also has firehose access. However, Bing Social Search doesn’t really let you go back more than a few days.

Like Twitter’s own search engine, Bing’s tool is designed more to let you search what’s being said right now through Twitter and other update services. It’s not aimed at providing some type of historical search service.

Tweet Origin Tools

Some of those missing Google Realtime Search may be trying to track a popular topic on Twitter back to its origin. What The Trend may help here, and I’ll try to gather some others in the future. Here are also some past articles on this topic:

Where’s Twitter’s Own Archive?

At this point, you may be wondering why Twitter doesn’t make it possible to search through its own tweets for as far back as it has them. Yes, that does seem kind of crazy. However, it’s a conscious decision that Twitter has made.

Twitter has repeatedly told me, and others, that it wants to create search products that it thinks are more important to its users and that partners aren’t providing.

As Mike Abbott, Twitter’s vice president of engineering, told me last year:

Google doing it [archive search] takes some of the pressure off. Where do we want to innovate in this world and drive unique set of experiences?

There’s no doubt that Twitter has built some great search tools. These articles have a bit more about that:

And these articles talk more about the issue with archive searching and Twitter in general:

As For The Library Of Congress…

By the way, you may recall that Twitter has been sending tweets to the US Library Of Congress. While that is an archive of sorts, it’s not one that anyone can search.

Also, just a little privacy reminder. While you can delete tweets, you’ve effectively only got six months from when you make a public tweet to prevent it from being stored with the Library Of Congress.

There’s a six-month delay in the data they receive. After that, there’s no mechanism to prevent your tweets from later being discovered by Logan and Jessica when they stumble into the ruins of DC in the distant future.

What’s Left At Google?

Google still has access to any tweets that it finds through its regular crawling of the web. That means if you’re doing a regular Google search, you might find tweets that way. It just won’t be as focused, so you might find it helpful to use some of the search techniques covered here:

I’m checking to see if Google has any hidden commands that might help. One of the best is doing:

site:twitter.com/accountname

That type of search restricts a search to tweets from a particular person.
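For anyone who wants to script that kind of lookup, here’s a minimal sketch of how the query could be turned into a search URL. The helper name `account_search_url` and its optional `keywords` argument are my own illustration, and the endpoint is assumed to be Google’s ordinary web search address with its `q` parameter. The only part taken from the above is the `site:twitter.com/accountname` operator itself.

```python
from urllib.parse import urlencode

# Assumption: the ordinary Google web search endpoint and its "q" parameter.
GOOGLE_SEARCH = "https://www.google.com/search"


def account_search_url(account_name, keywords=""):
    """Build a Google search URL restricted to one Twitter account's pages.

    Hypothetical helper for illustration only.
    """
    # The site: operator limits results to URLs under twitter.com/<account>.
    query = f"site:twitter.com/{account_name}"
    if keywords:
        # Extra words narrow the results to tweets mentioning them.
        query = f"{query} {keywords}"
    # urlencode escapes the colon, slash and spaces so the URL stays valid.
    return f"{GOOGLE_SEARCH}?{urlencode({'q': query})}"


print(account_search_url("dannysullivan", "firehose"))
```

This only constructs the address you’d paste into a browser. It doesn’t change what Google has actually managed to crawl, so the freshness problems described next still apply.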

Expect Delayed Tweets At Google

However, in trying that today, you can already see problems that Google’s having now with tweets:

These are all tweets that I made yesterday. Nothing I tweeted this morning (I’ve done at least four tweets) is showing up. Worse, you can’t even tell what these tweets are about, as there’s been no title automatically created for them.

When I went looking for one particular fresh tweet of mine, I couldn’t find it, though oddly, I did get shown someone retweeting it:

Annoyingly, I’ve also found some cases where aggregators show up when my own tweet doesn’t:

That leads over to the Inagist site, which I’d never heard of before, and which apparently embeds the photo I uploaded through Twitter to yfrog. Or something. It kind of makes my head hurt.

All I know is that I don’t find my tweet, which is a problem for Google, but also for Twitter. But let’s stick with the Google problems, for now.

Twitter, Google & News Shares

Google also uses Twitter data in a variety of other ways. One way had been to show the number of shares of news articles or updates that people were doing related to a news topic. Some examples of these are in our article from last October:

Looking today, I see less of this. But occasionally, these do appear:

If you try to drill in, you get an error:

Twitter, Google & Social Search

Google also taps into Twitter for its Google Social Search service, both to help create connections and to help surface content that is being shared on Twitter by those in your network. Our story from February covers this more with some examples:

Looking today, I can still see this working, where Google is clearly seeing things that are shared via Twitter through its ordinary crawling.

Look at the last line below, and you can see how Google flags this story as being shared on Twitter:

But interestingly, I also noticed something new today:

There, you can see how when I hover over “Matt Cutts” in the “shared this” area, I’m told I’m connected to him through the new Google+ social network.

While Google+ has mainly seemed a way for Google to collect data it feared being locked up within the walls of Facebook (see Steven Levy’s excellent Wired piece for Google effectively confirming this), it suddenly is providing a useful backup for Twitter, too.

For now, that backup mainly seems to be helping in forming social connections. But in the future, it could be that Google Realtime Search might return powered by posts from Google+.

Loss Of Link Juice

Finally, Google has used the sharing on Twitter as a form of ranking signal to help determine the quality of content it lists. This had a bigger impact on results in Google Realtime Search, but it was also used in other ways. Our story below has more on this:

Over at SEOmoz, they’ve been trying to test whether the loss of the firehose may have impacted SEO efforts. I think the results are fairly inconclusive, but you may want to check them out.

One big change is that tweeted links are back to being nofollow, i.e., not passing link credit.

As my What Social Signals Do Google & Bing Really Count? article explains, in the Twitter firehose, links didn’t have nofollow attached. That’s a lot of link juice that’s just evaporated. It’s unclear what the impact will be for publishers and Google alike, yet.
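To make the nofollow distinction concrete, here’s a small sketch that sorts the links in a chunk of HTML into those carrying `rel="nofollow"` and those that don’t. The class and function names are hypothetical, and the sample markup in the usage line is invented for illustration rather than copied from Twitter’s pages. How a search engine actually weighs either kind of link is its own business; this only shows what the attribute looks like to a parser.

```python
from html.parser import HTMLParser


class LinkCreditParser(HTMLParser):
    """Collects link targets, split by whether rel contains 'nofollow'."""

    def __init__(self):
        super().__init__()
        self.followed = []    # links without nofollow
        self.nofollowed = []  # links marked nofollow

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if not href:
            return
        # rel can hold several space-separated tokens, e.g. "external nofollow".
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            self.nofollowed.append(href)
        else:
            self.followed.append(href)


def split_links(html):
    """Return (followed, nofollowed) lists of hrefs found in html."""
    parser = LinkCreditParser()
    parser.feed(html)
    parser.close()
    return parser.followed, parser.nofollowed


# Invented sample markup: one plain link, one nofollow link.
sample = ('<a href="http://a.example/">plain</a> '
          '<a rel="external nofollow" href="http://b.example/">tagged</a>')
print(split_links(sample))
```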

Says Google…

I talked with Google’s Amit Singhal, who oversees all of Google’s search products, about the impact of the Twitter firehose being closed. He said that Google won’t catch tweets as quickly as in the past, though he said the delay would be one of shifting from seconds to minutes. My testing above suggests it’ll be much longer than that, in some cases.

Singhal also said that Google probably won’t have as comprehensive a collection of tweets as it did in the past. While technically, Google has the capacity to crawl Twitter’s site and gather up all the tweets when they happen, he figured that would probably crash Twitter. Search engines generally try to be “polite” when crawling and not gather data so quickly as to impact a site’s human users.
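As a rough illustration of what “polite” means in practice, here’s a minimal sketch of per-site throttling: the crawler keeps track of when it last scheduled a request to each host and waits before sending the next one. The `PoliteScheduler` class, the two-second delay and the example URLs are all assumptions of mine, not anything Google has described. Real crawlers also take other signals into account, such as a site’s robots.txt file.

```python
import time
from urllib.parse import urlparse


class PoliteScheduler:
    """Tracks the earliest time each host may be requested again.

    Hypothetical sketch; the default delay is an arbitrary example value.
    """

    def __init__(self, min_delay_seconds=2.0, clock=time.monotonic):
        self.min_delay = min_delay_seconds
        self.clock = clock
        self.next_allowed = {}  # host -> earliest time of the next request

    def wait_time(self, url):
        """Return seconds to wait before fetching url, and reserve that slot."""
        host = urlparse(url).netloc
        now = self.clock()
        # Either we can go right away, or we wait for the host's next open slot.
        ready_at = max(now, self.next_allowed.get(host, now))
        self.next_allowed[host] = ready_at + self.min_delay
        return ready_at - now


if __name__ == "__main__":
    scheduler = PoliteScheduler(min_delay_seconds=2.0)
    for url in ["https://twitter.com/a", "https://twitter.com/b"]:
        delay = scheduler.wait_time(url)
        time.sleep(delay)  # pause instead of hammering the site
        print(f"would fetch {url} after waiting {delay:.1f}s")
```

Spacing requests out like this is exactly why a crawl-only approach can’t keep up with a firehose: the more considerate the crawler, the further behind the live stream it falls.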

In terms of social search, Singhal wasn’t certain yet what impact the Twitter change might have. But he did say that Google was already having to calculate the number of shares, or tweets, that a particular page on the web may have gained on its own. That means Google can continue to create those counts, though it may take longer for it to understand the full counts and how quickly something is being shared.

I also asked about the loss of Google Realtime Search. It launched as part of a big Google press event, with some realtime results injected directly into the main results. There was a lot of fanfare over how important and useful this was. With it gone, isn’t Google losing something?

“Ideally, we would still have a partnership,” Singhal said. “But we’ve decided in all, we’re OK with the current state of things.”

Singhal also clarified that the firehose was turned off on Twitter’s side, as a result of the agreement not being renewed. Google felt that Google Realtime Search had to close entirely, because it depended so heavily on Twitter content, even though other realtime content was also part of it.

Do People Really Miss Google Realtime Search?

Google might be right about the “getting along OK” part. Yes, I miss Google Realtime Search. Google tells me they’ve also had journalists begging for it to return and had to explain they can’t do anything without an agreement. Nicholas Carr penned a piece about having the “shakes” from this sudden realtime search withdrawal.

But in general, practically no one seems to be complaining. There was no barrage of “what’s up” tweets that came out when the service suddenly closed. In contrast, Google’s change of its navigation bar to black seems to have generated much more discussion.

My news editor Barry Schwartz, who constantly scans Google’s own forums and search forums across the web for his own Search Engine Roundtable site, tells me that he figures complaints about realtime search being gone are only about 5% of those about the new navigation bar, if that.

It reminds me of when Google couldn’t reach a deal with the Associated Press last year. For about a month, AP content disappeared from Google News. Virtually no one noticed.

Still, a bigger test will come with breaking news events, I’d say. When actress Brittany Murphy died, it was the first big test of how realtime results could improve Google’s relevancy, and they very much did. Google seems to have lost something useful, I’d say.

It’s Also Twitter’s Problem

Of course, Twitter’s missing something, too. As I explained above, it’s not particularly nice to search for your own tweet and not be able to find it on Google, if that’s where you choose to look. Plenty will be looking there, I’d say, because Twitter has effectively trained them to do that. Nor has Twitter, so far, tweeted or posted anything about what people should do now if they want historical tweets.

More important, Google remains a powerful traffic driver. Now, instead of people finding tweets and ending up back at Twitter, they may show up at official aggregators or unofficial scrapers. That doesn’t seem to help Twitter’s bottom line, nor does it seem a good user experience.

Twitter, by the way, isn’t saying more than what I initially reported:

Since October 2009, Twitter has provided Google with the stream of public tweets for incorporation into their real-time search product and other uses. That agreement has now expired. We continue to provide this type of access to Microsoft, Yahoo!, NTT Docomo, Yahoo! Japan and dozens of other smaller developers. And, we work with Google in many other ways.

What Happened?

No one’s saying why the agreement was allowed to expire. While it was signed at the same time that Microsoft’s was, the Microsoft agreement is continuing. I get the impression this wasn’t because it was renewed but rather that the Microsoft deal hasn’t expired yet.

Microsoft told me, about the deal:

We won’t disclose the terms of the deal, but it’s a long term arrangement that we’re pleased with, and plan to keep in place as long as it’s delivering benefit for people who use Bing.

I got a tip after the news broke that perhaps sheds more light. I was told that the rumor is, according to several CEOs who run search start-ups, that Google was negotiating to renew the agreement for two years at $35 million per year, or $70 million in total.

Now that’s not much money for a company of Google’s size. But it’s likely a huge amount by Google’s standards, given that the company pretty much doesn’t pay to license anything.

The last deal with Twitter, rumored to be for $15 million, was pretty unprecedented. Carrying Twitter’s ads on Google, with Twitter’s branding, definitely was. Twitter Promoted Tweets Come To Google explains more about this.

So maybe it wasn’t about the money as much as other issues. That leads to something else the tipster told me: that Microsoft is apparently not that happy with its Twitter relationship, not seeing the value in paying for it when smaller search startups get firehose access for free, and that it might just drop Twitter and license the data out from a third party.

Caveat time. I’ve not had a tip from this person before, so I can’t vouch for it with a history of this person always being right. It could, for all I know, be entirely off the mark. If anyone knows better and wants to share more, get in contact. I’d love to hear.

I did go back to Google, Microsoft and Twitter with this information. Twitter just reiterated what it said before. Google did the same. Microsoft gave me the statement above about its deal generally and said it doesn’t comment on rumor and speculation.

Actually, Microsoft does comment on rumor and speculation, as do Google and Twitter, whenever they decide it’s in their interests to do so.

I also asked Microsoft if there were any plans to create a comprehensive Twitter archive search and was told:

We’re not discussing future product plans, but we don’t have any immediate plans to create a deeper archive.

That’s all I know, at this point. If you’re looking for those older tweets, definitely check out Topsy. As for Twitter and Google, I guess it’s stay tuned.

About The Author: Danny Sullivan is a Founding Editor of Search Engine Land. He’s a widely cited authority on search engines and search marketing issues who has covered the space since 1996. Danny also serves as Chief Content Officer for Third Door Media, which publishes Search Engine Land and produces the SMX: Search Marketing Expo conference series. He has a personal blog called Daggle (and keeps his disclosures page there). He can be found on Facebook and Google+, and microblogs on Twitter as @dannysullivan.

  • http://www.gamerstube.com Joe Youngblood

    it makes sense to me. why pay if no one else is? seems like a bad model on twitters part. of course they want google to pay because google is also trying to compete in their market (google+) why help out a competitor?

  • http://www.michael-martinez.com/ Michael Martinez

    I miss the realtime search results because they were a good indicator of how “hot” a topic might be.

    Google could still allow the Tweets to have a direct impact on its search results by arbitrarily disregarding the nofollows on Twitter accounts it was allowing to pass value before. I believe you have reported in the past that only SOME Twitter accounts were actually allowed to pass value.

    Google should have a historical record of which accounts those were and they could construct a seed set or maybe even a training set for a Panda-like algorithm to find and vet new Twitter accounts.

    As many SEOs have been quick to point out, there is no law requiring Google to actually honor the “nofollow” attribute. Google made up that rule and Google could certainly work around it.

    This may be a situation where it would make sense for Google to figure out something.

    Meanwhile, people can look for alternative methods to surface their Tweets and help them pass value.

    Google may have just created a whole new class/generation of Tweet spam. “Welcome to the Caribbean, love!”

  • http://postpo.st PostPost

    Hi Danny. Great summary of the state of Twitter search.

    We also love Topsy but want to clarify that it’s not “the only Twitter archive search service left standing”—PostPo.st is still standing. Today, we go back 3200 Tweets for PostPo.st users and 400 for the people they follow (if they’re not PostPo.st users).

    We’re at the tail end of our beta, so it’s no fault of yours that you don’t know about us: we’re the search engine that returns historical Tweets from you and the people you follow on Twitter.

    As for the merits of a searchable Twitter archive, it’s amazing how engaging the content is when you’re searching through and filtering by the people you follow, and the types of content they share.

    Here’s an example, rich with links and photos:
    http://postpo.st/bradnoble/search?q=endeavour

    You’re welcome to give us a try.

  • http://paulgailey.com Paul Gailey

    No mention of the commendable Backtype service that Twitter acquired this week?

    Another alternative (more so for the enterprise/brand marketer) is the powerful DataSift for Twitter firehose querying and search.

  • http://www.shonmckee.com Shon McKee

    I was wondering what happened to that. I used it quite a bit.

    Topsy is cool but IceRocket is really good. It searches not only Twitter, but blogs and Facebook.

  • http://www.infodocket.com/feed/ gary price

    Hello Everyone, Gary Price here.

    I’m a Contributing Editor that recently joined the SEL team.

    I plan to do a blog post about a few Twitter archiving databases and tools in the next week. I will take a look at the resources you’ve mentioned in the comments and a couple of others. I would love to learn about others.

    Here’s a quick look at two resources I will take a look at.

    1. SnapBird
    http://snapbird.org/

    This search tool allows users to search by keyword using a specific Twitter handle. You can also search for DM’s that you sent and received along with tweets and favorites of your friends.

    I’ve only used SnapBird for a short time and I’ve encountered a few glitches. However, when it worked I was able to find what I was looking for.

    The oldest tweet I found in my searching was from October, 2009.

    2. TwapperKeeper (TK)
    http://twapperkeeper.com

    The service is free (they also offer extras for a fee) and has been around for a few years. I use it regularly and it’s great. This is the place to go if you want to create an archive of tweets using your handle, a keyword, or a hashtag.

    All archives are public and you can search for them from a link on the TwapperKeeper.com home page. At this time you can access and search more than 25K archives containing more than 3 Billion archived tweets.

    Here’s a searchable archive of every tweet (more than 85,000) using the hashtag #smx back to March, 2010.
    http://twapperkeeper.com/hashtag/smx

    Learn more about the TK fee-based service here: http://twapperkeeper.com/premium.php

    You can download the software and create archives on your own servers.
    http://your.twapperkeeper.com/

  • http://irwebreport.com IRW

    Google real-time search email alerts are another casualty of this. Not sure how much people used real-time alerts, but they were amazingly fast for monitoring brand mentions and other high value topics..

  • Joe Russ Bowman

    While I have no plans to implement a real time streaming search (and will do a blog post later to explain why), my still in heavy development search site http://www.unscatter.com integrates with Topsy as well as other search engines like Blekko. Topsy results are featured on default search queries and the search engine also includes support for Blekko slashtags plus some custom ones I’ve written such as /twitter and /facebook

    I plan on much deeper integration with Topsy in the future, after I finish some back end and front end work to better implement caching.

  • http://postpo.st P.P.

    @ IRW check out @twilert at http://twilert.com – they do what Google RealTime email alerts used to do.

  • http://www.postlinearity.com gregorylent

    i want all my tweets .. i will pay for it .. the learning curve and the record is worth it ..

    how can i do that?

  • http://klout.com/#/jdrch Judah Richardson

    Nicely done, this is the best writeup of this issue I’ve read so far

  • http://niute.ch N.T.

    Why have you mentioned Topsy and haven’t mentioned WhosTalkin or Socialmention or other real-time search engines

  • http://www.pagerank-seo.com Robert Visser

    One of the great losses with the expiration of the agreement w/ Twitter & while Google Realtime is revamped are the links to breaking news.

    What I’ve found substantially less useful since Google Realtime was removed last Sat. is checking the Google Alerts for both my clients and my own business. Many of these alerts were set-up specifically to check mentions & links for keywords & profile names. The absence of these goes beyond just Twitter. It includes sites with which Twitter cross fed Timelines/content. These include Friendfeed, Identi, Plurk, Pixelpipe, etc.

    While I wouldn’t want to speculate on whether Google would have attributed anything near the boost we received in the SERPs from Twitter to any of these other social media services, that they’re now absent (or at least diminished) from Google Alerts potentially has broader impacts. CRM & online reputation monitoring, etc.

    We’ll have to see what’s reintroduced & how Google+ will be integrated. I’m disillusioned with the algorithm the Google+ team is running on Sparks. In my opinion it needs a serious jolt of caffeine in the form of integrating Realtime results.

    I was pleased to see the addition of Social Plugin Tracking in Google Analytics http://goo.gl/CYruJ .

  • Darryl Lee

    As always, thanks for the great reporting, Danny. I’m a long-time reader.

    As an end-user I could care less about real-time results in Google, especially since most of those results are available on Twitter itself.

    But I’ve been hating Twitter’s lack of historical search for a long time now. Topsy is pretty good, but I suspect they’re only indexing people with a minimum number of followers. Because I can’t find my old posts. Same thing for snapbird, twapperkeeper, and postpo.st. Which is fine. I don’t expect a third-party to waste resources archiving all the times I’ve tweeted trying to win an iPad2. But I do expect that Twitter itself ought to make it easier to search a very finite number of tweets, since I’m limiting it to just my username.

    @gregorylent, all your posts are at http://twitter.com/#!/gregorylent (assuming that’s you), but I suspect that you really mean you’d pay to have them be searchable beyond 1 day. I hear you man, I hear you. But apparently nobody else cares about looking back, just what’s happening now now now.

  • http://postpo.st P.P.

    @Darryl Lee,

    Have you searched PostPo.st for “ipad 2″? I think you’ll be amazed what you find.

    Please let us know on Twitter @postposting.

  • Darryl Lee

    Thanks @postposting — can I just call you PoPo? I loaded myself up, and it’s nice, but it doesn’t find this one: http://twitter.com/#!/notyoutoo/statuses/5901613258

  • http://www.sergiozaragoza.com SergioZaragoza

    Great article. The worst part is the shutting down of Google Real Time Alerts on google alerts system. This emailed me real time buzz search results on my customers names and products. either with their @account or just the name… that’s the worst part.. any other solution for that??
