In 1997, a computer called Deep Blue beat world chess champion Garry Kasparov. Headlines trumpeted the victory of machine over man, as we humans were “conquered”, “vanquished” and, as a result of our defeat, “stunned.”
The real question isn’t why we were finally defeated by a chess-playing computer, but why it took so long. Chess is a game that computers should excel at.
The whole point of the game is analyzing possible moves and picking the one that yields the highest probability of a successful outcome. That’s what computers do.
It’s actually rather amazing that humans stacked up so well, for as long as we did, against the best that IBM could throw at us. The 1997 match was not the first duel between man and machine.
It was simply the one the machine won. Before that, Kasparov and others had consistently bested the most powerful computers in the world. And even the 1997 match wasn’t a blowout. Deep Blue won the six-game match 3½ points to 2½.
How could we compete against something that could process 200 million positions a second? The human brain can’t come anywhere near that level of mathematical dexterity. Our ability to mathematically evaluate positions can be numbered in the dozens per second. If playing chess were all about processing math, we would have been bested long before 1997.
But humans are extraordinarily good at split-second processing based on intuition and pattern recognition. What Kasparov could do by instinct took millions of MIPS (millions of instructions per second) of processing horsepower.
In fact, Kasparov protested that Deep Blue had an unfair advantage: it was able to analyze hundreds of Kasparov’s past games, looking for patterns, while he had no such advantage. Also, Deep Blue didn’t do it alone. IBM programmers were allowed to go in and tweak the program between games, keeping Deep Blue from falling into traps laid by Kasparov. It’s actually amazing, when you consider the odds stacked against Kasparov, that he did as well as he did. And it wasn’t because he was a better machine than Deep Blue. It was because he was human.
In the 14 years since the match, computers have become exponentially more powerful. And if we’re benchmarking computer performance against humans, the bar needed to be raised substantially, because, unlike chess, most of the things we humans do deal with ambiguity and nuance. We were built to deal with messy and uncertain environments. If our human advantages allowed us to compete against a computer in a test as mathematically precise as chess, imagine the advantage we have in the organic world.
This Is Jeopardy
It’s that world of ambiguity, represented by human language, that IBM chose as its most recent man vs. machine challenge. The game show Jeopardy provided the forum, and this time it was a machine called Watson that was the challenger. Watson came to the Jeopardy stage, prepared to take on the all-time champs, Ken Jennings and Brad Rutter.
Jeopardy presented a much bigger challenge to IBM than chess did. To win, Watson had to be able to understand human language. That is especially difficult given that Jeopardy turns the typical question-and-answer structure on its head, providing the answer and asking contestants to respond in the form of a question.
If we were just measuring the ability to store data (something we humans call memory) there would be no contest. Watson would blow us away. The entire recorded history of man could be stored in its memory bank.
For humans, the limiting factor was the amount of trivia we could cram into our craniums. But for Watson, the challenge was interpreting the question and knowing which information to access and present back as a response.
One of the biggest challenges IBM has ever taken on (the same problem, incidentally, that Google struggles with every day) was something we humans do instinctively, without thinking. It’s another example of how astoundingly efficient our brains actually are.
The Human Part Of Usability
My point, and there is one, is that as we consider user experiences and test usability, we have to have a fine appreciation for what makes humans human. All too often, usability testing relies on reams of data, crunched and analyzed in a zillion different ways. We examine bounce rates and benchmarks, as if our users were machines and the answers we seek could be arrived at mathematically.
The irony of usability is that, most often, we try to understand what humans want without ever talking to one directly. We rely on a spreadsheet to reveal the mysteries and subtleties of the human condition. We reduce the magnificence of the human brain to nothing more than a machine, something that can be understood by examining inputs and outputs.
Let me give you one brilliant example of true human-based testing that I once heard at a conference. Motorola Senior VP and Chief Marketing Officer Eduardo Conrado was talking about how the company tests radios used by emergency response teams.
Motorola was testing a new model that had just been released. The radio had already gone through the company’s extensive in-lab testing and design process. The prototype was now ready for field testing. This is when Motorola actually goes out on first response calls and watches how its radios are used in real-life situations.
Despite all the previous testing, Motorola’s researchers soon realized they had a problem. As part of the redesign, they had tried to reduce bulk, introducing a smaller, more efficient radio. The reasoning, which was logical, was that the first responders would appreciate not having to carry around bigger radios. But there was a flaw in that reasoning.
It’s only obvious once you see it…
First response situations are incredibly stressful. They demand an extraordinarily high (sometimes super-human) level of performance on the part of the response team. The human body prepares for this anticipated demand on its resources by cranking up its metabolism. The heart starts beating faster. A lot faster.
A study from Indiana University found that during a fire, a firefighter’s heart rate can approach 100% of its maximum for prolonged periods. By comparison, a world-class marathoner typically runs at 85 to 90% of maximum heart rate in a race. The body also signals the release of adrenaline and other neuro-stimulants to allow it to perform in the required high-stress situation.
For the average first responder, the stress on their body while on the job would be the same as if they had just exercised full out for several minutes. Imagine then the challenge of trying to use a smaller, slimmer radio. The problem was immediately obvious to the field team – “The buttons are too small!” In the lab, the new design was perfect. In the real world, in the hands of real people, it was unusable. The crew’s hands were shaking too much to be able to use the smaller controls on the radio. The design was quickly modified.
So why are these human factors typically absent from much of what passes for usability testing? I suspect it’s because human factors are very difficult to measure. Things like intuition, habit and emotion, all of which can significantly impact a user experience, can’t be measured quantitatively.
By their very nature, they require human interpretation. It’s the same reason why IBM’s Watson, for all its sheer processing power, still can’t chat with you about your upcoming vacation or how your kids are doing in school (the ability to hold such a conversation, incidentally, is what Alan Turing proposed as the ultimate test for artificial intelligence).
The only way to understand the human element is to use human-to-human methodologies. It can be as simple as observing the behaviors of actual users, or as complex as a large-scale ethnographic study. Whatever route we choose to take, it’s essential that we not lose sight of this human element. We are not machines. We are far more than that. Consider this for a moment. It took IBM researchers and engineers years to create a machine that could best Garry Kasparov at a game of chess.
Eventually, they succeeded. But it was a machine that could only play chess, albeit at a very high level. Kasparov could also protest totalitarian regimes, write books, file a patent application and, one supposes, show love, reciprocate friendship, reflect on sunsets and appreciate art. Deep Blue, or Watson, or any other machine, has never accomplished any of those things.
Opinions expressed in the article are those of the guest author and not necessarily Search Engine Land.