As a search engine optimization (SEO) professional, I often use the word aboutness whenever I label and describe content. Aboutness plays a critical role in information retrieval systems. If a search engine determines that a document (web page, PDF, graphic image, video) is topically related to a searcher’s keywords, then a link to the document often appears in the organic search listings.
According to many information sciences professionals, the term aboutness has multiple meanings and layers. For example, many search professionals might be familiar with intentional, extensional and pluralistic aboutness. Let’s look at these terms individually and see how they are directly related to our jobs as search engine optimizers.
Aboutness: Search engines, users, and context
R.A. Fairthorne is credited with coining and defining the term aboutness back in 1969. According to Fairthorne, intentional aboutness describes the meaning of a document (such as a web page, PDF, graphic image or video) from the author’s perspective. Authors are able to state what a document is “about” by formulating an expression which “summarizes” the content of a document (Hutchins 1997). Summarization involves the selection of keywords or keyword phrases.
As SEO professionals, we understand keyword selection and placement. We describe the intentional aboutness of web pages in a variety of places, including but not limited to:
- (X)HTML title tag
- Meta-tag description
- Meta-tag keywords (if used)
- Page content
- URL (or web address)
- File name
The way we describe and label website content should ultimately communicate to both search engines and web searchers what the content is about.
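As a rough illustration of the list above, the following Python sketch checks whether a keyword phrase appears in a few of those places on a page. It is a minimal sketch, not a picture of how any search engine works: the class name `AboutnessSignals`, the function `phrase_coverage`, the sample page and the choice to also inspect the first-level heading are all assumptions made for this example.

```python
from html.parser import HTMLParser


class AboutnessSignals(HTMLParser):
    """Collect text from a few on-page places where aboutness is described."""

    def __init__(self):
        super().__init__()
        # Hypothetical set of places to inspect; extend as needed.
        self.signals = {"title": "", "description": "", "keywords": "", "h1": ""}
        self._current = None  # tag whose text is being captured, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("title", "h1"):
            self._current = tag
        elif tag == "meta" and (attrs.get("name") or "").lower() in ("description", "keywords"):
            self.signals[attrs["name"].lower()] = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current:
            self.signals[self._current] += data


def phrase_coverage(html, phrase):
    """Return {place: True/False} for whether the phrase appears in each place."""
    parser = AboutnessSignals()
    parser.feed(html)
    phrase = phrase.lower()
    return {place: phrase in text.lower() for place, text in parser.signals.items()}


# Made-up sample page, used only for illustration.
SAMPLE_PAGE = """
<html><head>
<title>Black Hiking Boots | Example Outfitters</title>
<meta name="description" content="Waterproof black hiking boots for day hikes.">
</head><body>
<h1>Hiking Boots</h1>
<p>Our black hiking boots are built for rough trails.</p>
</body></html>
"""

coverage = phrase_coverage(SAMPLE_PAGE, "black hiking boots")
print(coverage)
```

On this sample page the phrase is found in the title and the meta description, but not in the heading or in a meta keywords tag (there isn't one). A gap like that is worth a human look; it does not automatically mean something is wrong.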
For example, let’s take a graphic image. Since search engines have a difficult time accurately understanding the aboutness of a graphic image, they look at the context in which an image is used. What page is the graphic image used on? Is that page optimized (or partially optimized) for the content in that graphic image? If the graphic image is a photo of black hiking boots, is the photo used on a product or category page about hiking boots? What is the file name of the photo? Is it a SKU number that only makes sense to the website owner, or is it more descriptive, making sense to searchers and search engines alike? What about the text surrounding an image? Believe it or not, captions can actually have an impact on search engine visibility.
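The file name question can be turned into a quick check. The sketch below is only a heuristic, and everything in it is assumed for illustration: the helper names `file_name_words` and `file_name_matches`, the example paths, and the idea that a descriptive file name should contain every word of the target phrase. It also assumes a plain path with no query string.

```python
import re
from pathlib import PurePosixPath


def file_name_words(image_url):
    """Split an image file name into lowercase alphabetic words, ignoring the extension."""
    stem = PurePosixPath(image_url).stem
    return [w for w in re.split(r"[^a-z]+", stem.lower()) if w]


def file_name_matches(image_url, phrase):
    """True if every word of the phrase appears in the image file name."""
    words = set(file_name_words(image_url))
    return all(term in words for term in phrase.lower().split())


# Hypothetical file names: one descriptive, one a SKU that only the owner understands.
print(file_name_matches("/images/black-hiking-boots.jpg", "black hiking boots"))
print(file_name_matches("/images/SKU-48213.jpg", "black hiking boots"))
```

A file name that fails this check is not necessarily a problem, but it is a candidate for renaming if the surrounding page and caption are not already doing the descriptive work.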
According to Fairthorne, extensional aboutness is reflected semantically by actual units and parts of the text. In other words, how we present content on our web pages via titles, headings, sentences, captions, etc. influences the extensional aboutness of page content. In an ideal situation, intentional and extensional aboutness should match. However, it is commonly known that a single term can have multiple meanings in different contexts, and that a single concept can be represented using more than one term (Chung et al. 1998). There is also pluralistic aboutness, which arises when desired information can be labeled in multiple ways, or when one keyword doesn’t address the entirety of the topic (Morville 2005).
Intentional aboutness, extensional aboutness, pluralistic aboutness—what do these terms ultimately mean to an SEO professional? And how can we be sure that the “correct” interpretation of a page’s aboutness is communicated to both searchers and search engines?
Measuring and communicating aboutness
Many SEO professionals feel that a web page’s aboutness is communicated simply by keyword repetition. If you use keywords many times on a web page, then clearly the page is focused on those keyword phrases, right? I wish it were that simple.
First of all, search engines haven’t measured keyword density as a ranking factor for a very long time. However, that doesn’t mean that web pages (and graphic images and multimedia files) shouldn’t contain keywords. Keywords are essential for communicating aboutness. But keywords should be placed judiciously so that the aboutness of the page is clear to both search engines and web searchers.
Here is a usability test I like to run: the 5-Second Test, a name coined by the folks at User Interface Engineering. I show the web page to test participants for only 5 seconds, then ask them what they believe the page’s content is focused on. If I don’t hear the most important keyword phrases, then the page probably isn’t communicating the aboutness of its content above the fold, where users view content first.
And how do we SEO professionals communicate aboutness of content above the fold? We write effective headings with keywords. We write effective title tags that reinforce headings. If a video or graphic image appears above the fold (in the main content area), we write a caption for that image or video that is interesting to readers and reinforces image/video content at the same time. Many of our breadcrumb links contain important keywords. Keyword repetition can be done effectively without annoying users/searchers or being perceived as spam by search engines (keyword stuffing).
Search engines also look for the aboutness of a web page via off-the-page factors. I like to think of link development as validation of a user/searcher mental model. If a searcher types a keyword phrase into Google, for example, the searcher wants to see his keywords validated in search results and those same keywords on the web page that he ultimately lands on. He wants to see his information scent (keywords) validated.
Well, this scent validation also occurs on pages other than search engine results pages (SERPs). If a person clicks on a link from Search Engine Land to another website, he wants to be sure that he goes to the right web page and website. If a person clicks on a link from a newspaper or magazine article, he wants his keywords validated and reinforced. Search engines want that, too. That’s why link development is also a part of aboutness.
However, Peter Morville also noted that the authority of the masses can redefine the aboutness of the object, even though there can be a good match between the words of the author and searcher. Mass authority can be a good thing, because authors are not always right, or it could be a bad thing, because the “wisdom of the crowds” is an often-cited fallacy.
What do you think should be used to determine the aboutness of web documents? Is semantic content the best way to determine aboutness? The wisdom of the crowds? Both? Neither? What do you think the search engines are missing? What are they getting right? Where are there areas for improvement? Please share your thoughts in the comments below.
For those of you who wish to read more about aboutness in the context of information sciences and information retrieval, here are some great reference materials that I’ve used.
- Browne, G. and Jermey, J. (2007). The Indexing Companion. Cambridge: Cambridge University Press.
- Bruza, P.D. et al. (2000). “Aboutness from a Commonsense Perspective.” Journal of the American Society for Information Science 51(12): 1090-1105.
- Chung, Yi-Ming et al. (1998). “Automatic Subject Indexing Using an Associative Neural Network.” In Proceedings of the 3rd ACM International Conference on Digital Libraries, 59-68.
- Fairthorne, R. A. (1969). “Content analysis, specification and control.” Annual Review of Information Science and Technology, 4, 73-109.
- Hutchins, W.J. (1977). “On the problem of ‘aboutness’ in document analysis.” Journal of Informatics, 1, 17-35.
- Hutchins, W.J. (1997). “The concept of ‘aboutness’ in subject indexing.” In K. Sparck Jones and P. Willett, Readings in Information Retrieval. San Francisco: Morgan Kaufmann, 93–97.
- Morville, Peter (2005). Ambient Findability. Sebastopol, CA: O’Reilly Media, Inc.
- Taylor, Arlene G. (2004). The Organization of Information. 2nd ed. Westport, CT: Libraries Unlimited.
Opinions expressed in the article are those of the guest author and not necessarily those of Search Engine Land.