Yellow Pages Usage Stats Are Likely Wrong
I was reviewing print yellow pages usage statistics this week, in preparation for the SMX Local & Mobile conference, and I was struck once again by the large numbers that the Yellow Pages Association (YPA) touts in its press releases and research papers. Some individuals at conferences have cited these YPA figures to dismiss the idea that print usage could be dropping much, and it’s made me wonder whether there could be some degree of error in the data sampling methods used by the research companies behind the reports.
With very little checking, I found that my suspicion has some basis: judging by the published data collection methods, some of the widely cited yellow pages industry research is missing critical information.
Print usage remains stable?
For instance, the YPA’s spring announcement of last year’s yellow pages usage figures states that “…print usage remained stable with 13.4 billion print Yellow Pages references, unchanged from 2006, according to the 2007 Knowledge Networks/SRI (KN/SRI) Industry Usage Study…”. The PR also states that “Approximately 87 percent of the U.S. population used the print Yellow Pages in 2007.” Both of these figures struck me as a little odd, because conventional wisdom has been that print YP usage is dropping year over year as internet search and mobile phone search increase.
In 2007, the YPA had reported a drop along the lines of 7.6 percent in print book references over the year 2006. It seems odd to me that in 2008 we would see no continuing slide in the number of print references. Charles Laughlin, an analyst with the Kelsey Group, commented on this possible discrepancy, stating:
The methodology used to measure usage is such that a big one-year swing may not be as significant as the trend over time. One ongoing question is whether there is a floor for PYP usage, based on the limits of Internet penetration and the advantages of PYP, at least in certain categories. The pattern over time has generally been periods of stability followed by periods of decline. So rather than a single floor, there may be a series of floors as the mix of users and media choices evolves over time.
If I might paraphrase, I think Charles is suggesting that the figures might show a stairstep pattern within an overall downward trend for print, so the seemingly stable usage of print YP from 2006 to 2007 perhaps masks the fact that users are likely continuing to reduce their use of print over the longer term.
Rather than a fuzziness of statistical distribution, I’ve had a much stronger suspicion that the research itself fails to sample a large segment of the population.
Data based on “telephoned” households
According to Knowledge Networks, they gather data by polling a sample set of the U.S. population through generating a set of random phone numbers distributed through markets across the country. For the spring research figures, they called 9,008 people, compiled that data, and projected it as statistically representative of the entire U.S. population.
The problem I have had with this methodology is that this sort of phone polling misses households which have ditched landlines in favor of only using their cell phones.
The fact sheet for Knowledge Networks’ YP Market Reporter states that the info is based on telephoned households:
Based on a projectable probability sample of adult respondents who live in telephone households, YPMR is grounded in sound survey research methods and adheres to the ARF guidelines for audience measurement. Individual market reports are available and based on a sample of adult consumers located in the targeted Yellow Pages directory distribution areas.
Knowledge Networks anticipated my sort of criticism, and in their YP Directory Audience Measurement White Paper, they state that of the estimated 8% of households without landlines that are missed in their sampling, the 2% entirely without phones are not likely to be potential customers (I’m supposing), and the 6% that are cell phone only households are not statistically significant:
The approximate 8% of households not covered in RDD sampling is comprised of households without landline phones (2%) or cell phones only households (6%). Those households without any phone line may be less pertinent to directory share estimates and cell phone only households represent a small coverage issue (6%) and are unlikely to introduce bias into share estimates due to the small contribution to the total.
Just as a common-sense check, I tried to work out how those overall figures relate to real people using the yellow pages over the year. If we assume that the 13.4 billion print references were projected from the sample set of roughly 9,000 individuals onto the U.S. population of 301.6 million people, then the 87% of the population who used YP in 2007 would have averaged about 51 print references per person during the year. That equates to each of those individuals looking at a yellow pages book about once per week during the year. I think that’s a somewhat high number for most consumers, and it would be an even higher number of lookups per person if we removed children and other groups unlikely to be using the yellow pages from that overall population figure.
Now, I could possibly take issue with the overall trustworthiness of the report aside from the non-surveyed population segment, because I think the sample set of people polled is a relatively small percentage of the overall U.S. population, and the standard error could be quite significant, even more so when this 8% gets factored in. For instance, the U.S. Census Bureau estimated the population of the U.S. at 301,621,157 as of July 1st last year, and the YP directory usage research was based on only 9,008 individuals.
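The back-of-the-envelope arithmetic in this check can be reproduced in a few lines. This is only a sketch using the three figures quoted above (13.4 billion references, a population of 301.6 million, and 87% usage); the variable names are illustrative, and it assumes, as the paragraph does, that the references were projected onto the whole population rather than onto adults only.

```python
# Figures quoted in the article (not independently verified here)
print_references = 13.4e9   # projected print YP references in 2007
us_population = 301.6e6     # U.S. population assumed for the projection
share_using_print = 0.87    # share of the population reported to use print YP

# People implied to have used a print directory at least once in the year
users = us_population * share_using_print

# Average lookups per user, per year and per week
refs_per_user_per_year = print_references / users
refs_per_user_per_week = refs_per_user_per_year / 52

print(f"Implied print YP users: {users / 1e6:.1f} million")
print(f"References per user per year: {refs_per_user_per_year:.1f}")
print(f"References per user per week: {refs_per_user_per_week:.2f}")
```

Shrinking `us_population` to adults only would raise the per-user average further, which is the point made about children and other unlikely users.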
I do understand that we can discount children and the extremely elderly from the overall population figures, and the people who gave telephone interviews were likely asked to provide information on how much their overall household used the yellow pages, but I think the numbers still indicate that there can be a large degree of potential error involved in these projections before you even take into account that missing 8%, purely based on the math involved.
Whenever you base statistics on projections from a sample group, there is a mathematically embedded percentage of error that can be assumed in all the resulting figures. We’re not told how much Knowledge Networks’ statisticians have computed this percentage to be for these reports, but I’d hazard a guess that it’s quite a few percentage points.
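For a rough reference point, the textbook margin of error for a proportion can be computed from the sample size alone. The sketch below assumes a simple random sample and a 95% confidence level; both are assumptions made here for illustration, not Knowledge Networks’ published method, and the function name `margin_of_error` is hypothetical. Real telephone surveys involve weighting and clustering that typically widen this figure, and the formula applies to a percentage such as the 87% usage rate, not to a projected total such as the 13.4 billion references.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion.

    Assumes a simple random sample of size n with no weighting or
    design effect. p=0.5 gives the widest (most conservative) margin.
    """
    return z * math.sqrt(p * (1 - p) / n)

sample_size = 9008  # respondents cited in the article
moe = margin_of_error(sample_size)
print(f"n = {sample_size}: about +/- {moe * 100:.2f} percentage points")
```

Under these assumptions the result depends on the sample size, not on the size of the population being projected to. More importantly, this kind of sampling error says nothing about households that were never eligible to be called, which is a separate source of error.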
Are users of yellow pages looking up something in their yellow pages book as much as once per week, on average? Considering that most consumers frequent familiar providers, I’d think this number was a little high.
Are the missing households significant?
But, let’s go back to the basis of their assertion that the number of cell phone only households missed by their surveying is insignificant. If it were only 6% of the population, and if that 6% used the print yellow pages at the same rate as the rest of the people surveyed, this might be true. But, I’ve been concerned that it could be far more than 6% who have ditched landlines in favor of their cell phones, and I suspect that leaving that group out of the statistics could bias the stats in favor of the print YP.
I think that people who have only landlines are likely to be an older population who may not even have adopted the internet, or who use it less. Common sense also tells me that people who use only cell phones are perhaps “early adopters” of technology and may be far more inclined to use their computers or cell phones to search for businesses and information, rather than using the print books.
As it turns out, this is likely a valid concern. We’re fortunate that the current presidential candidate races have resulted in a whole lot of attention being placed on telephone polling methodology. Political polling has this very same issue: pollsters survey a random sample set of a population in order to project out what percentage of that population is likely to vote for one candidate or another. Observers have questioned whether the possibility of missing cell phone only households could significantly bias results one way or another, and a number of researchers are saying that this is indeed a valid concern.
Cell-phone only populations growing
Mediamark Research has issued a research paper on this subject, The Birth of a Cellular Nation, and they state that the cell phone only population has grown to 14%. Newer figures have the percentage at closer to 16%. Demographically, according to them, 32.3% of 18-24 year-olds live in households with a cell phone but no landlines, as do 27% of adults designated as “single, never marrieds”.
The Centers for Disease Control and Prevention’s “National Health Interview Survey” for 2007 states that cell-only households comprised 13.6% of American homes, up from 12.8% at the end of 2006 (indicating to me that print YP usage figures from last year were potentially overcounted by some significant degree, too).
The Pew Research Center finds that polls that miss the cell phone only segment are likely to underestimate figures involving technology adoption:
Perhaps not surprisingly, excluding the cell-only respondents also yields lower estimates of technological sophistication. For example, the overall estimate for the proportion of 18-25 year olds using social networking sites is 57% when the cell-only sample is blended with the landline sample, while the estimate based only on the landline sample is 50%.
A New York Times article this month quotes Stephen Blumberg, a senior scientist at the National Center for Health Statistics, as saying that he projects cell phone only households to grow to more than 25% by the end of 2008. The article also quotes Martin Frankel, Professor of Statistics at Baruch College, stating:
Until Internet polling gets a decent sampling frame, telephone surveys are necessary, and we can’t exclude cellphones from telephone polling.
How does this impact industry research figures?
What we can conclude from all this is that the industry research figures are quite likely off by some significant degree, because they do not include a representative sample from the cell phone only segment of consumers. Print YP usage likely did decrease further between 2006 and 2007, contrary to the reports that it remained stable. And, the percentage of the U.S. population who used the yellow pages in 2007 was likely smaller than 87%, and there were likely fewer references to YP books than the stated projection of 13.4 billion.
It’s not really the fault of the Yellow Pages Association that their research data is undermined by incomplete sampling; it’s more the responsibility of Knowledge Networks, the firm that they have used to provide research. The YPA’s ARF guidelines for audience measurement don’t really go into detail on regulating audience composition and sample size, and perhaps they should. In my opinion, they should also require that the research clearly state the estimated percentage of error alongside the results.
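To see how much the missing segment could matter, the overall usage rate can be written as a weighted blend of the surveyed group and the group that was never called. The sketch below uses the 87% figure and the CDC’s 13.6% cell-only share from above. The cell-only usage rates are purely hypothetical, since no source cited here measures them; the sketch also treats the household share as a population share and ignores households with no phone at all.

```python
def blended_rate(covered_rate, missed_rate, missed_share):
    """Population-wide rate when a survey covers only part of the population."""
    return (1 - missed_share) * covered_rate + missed_share * missed_rate

landline_rate = 0.87     # reported print YP usage, from the landline-based sample
cell_only_share = 0.136  # CDC estimate of cell-only households for 2007

# HYPOTHETICAL usage rates among cell-only households (not from any source)
for hypothetical_cell_rate in (0.87, 0.70, 0.50):
    overall = blended_rate(landline_rate, hypothetical_cell_rate, cell_only_share)
    print(f"cell-only rate {hypothetical_cell_rate:.0%} -> overall {overall:.1%}")
```

If cell-only households used print at the same rate as everyone else, the published figure would stand. The further their usage falls below the surveyed rate, and the larger their share grows, the more the published figure overstates overall usage.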
But, with the increased scrutiny happening with telephone polling lately, Knowledge Networks is almost certainly aware of the potential undercounting issue, and should’ve moved before now to close the gap. I’m not really trying to beat up on anyone here, but these stats are widely publicised and they should be assessed and scrutinized carefully before being pushed out. No one wants to be selling the emperor’s new clothes, and businesses should be provided a clear understanding of what value they will receive for their advertising dollars.
In Knowledge Networks’ defense, I’d say that just a few years ago the number of cellphone-only households was a much smaller fraction of the overall population and likely was statistically insignificant. I think they unintentionally failed to review this assumption each year, and the number has now grown to the point where it may be affecting their survey results.
More recently, in April, Knowledge Networks issued some YP research based on surveying internet users (they refer to this sample set as their KnowledgePanel), which sidesteps the issue of missing cell phone only households that affects their other YP market reports. But, this newer research has a stated sample size of only 2,962 individuals, and the problem with that is that it would have an even larger estimated percentage of error, as a less-representative sample set of the overall U.S. population. Their figures indicating that “nearly two-thirds of the adult U.S. population who shopped online in a given month also referencing Print Yellow Pages in the same period” still seem a bit surprising to me, just from a gut check. Two out of every three adults who shopped online in a month also looked at a print yellow pages? Maybe.
Popular culture and some analysts now seem to believe that print YP is on the decline in usage and may go the way of the dinosaurs. While popular culture can easily be wrong, the usage research needs to be much more rock-solid, or statements and research papers that fly against conventional wisdom will be questioned rather than taken seriously.
I don’t really believe that print will become entirely extinct. I do think it’s going through a painful transition at the larger print YP publishing companies. But, there are signs that smaller independent print directories and specialty directories are still enjoying growth and stability, and I’ve seen that print is on an upswing in a number of foreign countries. Print directories are going to remain a big business for a long time, though I see some collapsing of the domestic U.S. market as being inevitable, particularly where larger directory companies are concerned. Greg Sterling points out that margin pressures have increased for print media, and those companies will certainly be pushed into making changes to adapt.
On the more positive side, usage of internet yellow pages remains strong with top YP and other local info sites staying aloft in the most-visited internet properties, according to comScore.
Opinions expressed in this article are those of the guest author and not necessarily Search Engine Land.