Using structured data to create a semantically enhanced web

At SMX Advanced, speakers Cata Milos and Max Prin highlight the rise of structured data for Bing and Google – and the work being done to expand its practical use.


To me, one of the more exciting developments in the SEO world in recent years is the growth and development of semantic markup and related technologies.

It’s somewhat dizzying to think of the full potential of a true Semantic Web – that dream of a meaningfully interconnected network of information that has been floating on the margins of the web-as-it-is since the very beginning. But recent practical applications of semantic organization, from Knowledge Graph to Schema.org to the machine learning behind voice search, have together seemed to indicate some real momentum in that direction.

Perhaps we aren’t creating one all-encompassing Semantic Web; perhaps it’s something more like a gradually evolving Semantically Enhanced Web, where the old architecture is not so much replaced as augmented by the new. But it’s still pretty cool.

In this light, I’m glad to see that this year’s SMX events have consistently given airtime to developments in structured markup and other topics one might class as semantic in nature. I was glad to attend one of these sessions at SMX Advanced, “What’s New With Schema & Structured Data,” hosted by Chris Sherman and presented by Cata Milos, senior program manager at Microsoft, and Max Prin, head of technical SEO at Merkle.

News flash: Bing wants your structured data, too

It should come as no surprise that Bing has been consuming structured markup at a pace and in a manner that mirrors the approach taken by Google. After all, Microsoft is a member of the consortium, along with Google, Yahoo and Yandex, that gave rise to Schema.org in the first place. But given that a lot of SEOs probably spend a little less time thinking about Bing than Google, it’s worth being reminded that structured markup matters for Bing – and that, although the two engines treat such markup mostly in the same way, there are some notable differences.

Milos explained that Bing has undergone a transformation in recent years from thinking of the web in terms of HTML to thinking of the web visually. The HTML view was appropriate, he said, when the web was much more text-heavy and the bulk of the information communicated on web pages could be reasonably distilled down to its text content. But today, with the extensive use of CSS, JavaScript, and multimedia content like images and videos, information is communicated in a much more visual manner. In response to this change, Bing’s web indexing process now renders web pages so they can be examined visually, and no longer relies merely on the HTML code itself.

Milos had several recommendations for how to build web content so that Bing can properly understand it, all of them grounded in this basic sense of visual orientation. He recommended that we think of the way humans apply visual understanding to complex documents. Human readers are trained to look for important elements like title, author, text, and images, and are trained to ignore secondary content such as additional links, ads, site navigation, and social media buttons. Given these expectations, web pages should be built in such a way that primary content is clearly identifiable and secondary content is minimally distracting.

Milos recommended HTML5 markup as a great way to tag page content semantically, since HTML5 contains tags like <header>, <nav>, <article>, and <footer> that allow you to identify page content in a way that has intrinsic meaning for browsers, developers, search engines, and readers alike.

In fact, Milos noted that 45% of Bing’s top indexed documents contain HTML5 semantic tags, suggesting that Bing may currently be placing a greater level of trust in semantic tags than Google does.

Even basic HTML components such as paragraph and heading tags should be used according to their semantic intent, according to Milos. Web developers should mark paragraphs with <p> tags rather than generic <div> or <span> tags, and should use <h1> to <h6> tags in correct descending order, matching the emphasis and visual importance of each heading. Table and list markup, such as <table>, <ul>, and <ol>, should be used only when creating an actual table or list, and not for page formatting or other purposes.
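To make the heading advice concrete, here is a minimal sketch of a check for skipped heading levels, written in Python with only the standard library. The class and function names (HeadingOutline, skipped_levels) and the sample markup are illustrative assumptions, not anything shown in the session, and the check says nothing about how Bing itself evaluates headings.

```python
from html.parser import HTMLParser


class HeadingOutline(HTMLParser):
    """Collects heading levels (1-6) in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag names, so "h1".."h6" is enough.
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.levels.append(int(tag[1]))


def skipped_levels(html):
    """Return (previous, current) pairs where a heading jumps down
    more than one level, e.g. an <h2> followed directly by an <h4>."""
    parser = HeadingOutline()
    parser.feed(html)
    pairs = zip(parser.levels, parser.levels[1:])
    return [(prev, cur) for prev, cur in pairs if cur > prev + 1]


# Hypothetical page fragment: the <h4> skips the <h3> level.
sample = "<article><h1>Title</h1><h2>Part</h2><h4>Detail</h4><p>Text</p></article>"
print(skipped_levels(sample))  # [(2, 4)]
```

An empty list means each heading is at most one level deeper than the one before it, which is the descending structure Milos described.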

Using markup according to these recommendations, according to Milos, improves your chances for content to appear in Bing featured snippets in the “zero position” in search, or in rich snippets appearing below the link in particular search results. Especially impressive in the Bing SERPs are rich results where Bing uses section headings in combination with elements like list markup to create two-level structured content beneath the primary search link, as shown in the screenshot below.

Bing rich snippets with hierarchical usage of semantic markup

As for other types of subject-specific semantic markup, Milos noted that Bing understands all common formats including Schema.org, RDFa, and Open Graph, but tends to prefer Schema.org markup in either JSON-LD or Microdata format, with a slight preference toward the increasingly popular JSON-LD version.

Schema.org markup is apparent in Bing search results displaying such elements as film ratings and credits; recipe ratings, categories, and cooking times; and article authorship. Notably, Google stopped referencing author tags several years ago, but Bing continues to use them.
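For readers who have not worked with JSON-LD, the sketch below shows roughly what recipe markup of this kind can look like, generated with Python's standard json module. The property names (recipeCategory, cookTime, aggregateRating and so on) come from the Schema.org vocabulary; the recipe itself, its values, and the helper name json_ld_script are invented for illustration, and nothing here guarantees a rich result in either engine.

```python
import json


def json_ld_script(data):
    """Wrap a dict as a JSON-LD <script> element for a page's HTML."""
    body = json.dumps(data, indent=2)
    # Escape "</" so a stray "</script>" inside a value cannot end the
    # element early; "\/" is a valid JSON escape for "/".
    body = body.replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Hypothetical recipe; every value below is made up.
recipe = {
    "@context": "https://schema.org",
    "@type": "Recipe",
    "name": "Example Banana Bread",
    "recipeCategory": "Dessert",
    "cookTime": "PT1H",  # ISO 8601 duration: one hour
    "aggregateRating": {
        "@type": "AggregateRating",
        "ratingValue": "4.7",
        "ratingCount": "128",
    },
}

print(json_ld_script(recipe))
```

Because the markup lives in its own script element rather than being woven through the visible HTML, it can be generated from the same data source as the page, which may be part of why JSON-LD has grown popular.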

Thinking outside the box

Max Prin, in his section of the presentation, encouraged the audience to think beyond traditional SEO goals when assessing the use cases and success metrics for structured markup. For instance, he noted that it can be alarming for some SEOs to realize that winning placement in featured snippets can mean a decline in page views, click-throughs and ad sales. After all, such “zero-click” results encourage users to stay on the search page by providing an answer to their question and removing the need to visit any web page.

But Prin suggested that the goal (for many sites) should be conversions, not merely page views or CTR, and that featured snippets can lead to conversion by other paths, such as creating top-of-funnel awareness and trust in a brand. He cited the example of Sixt, a car rental company that successfully targeted featured snippet placement with the goal of increasing car rentals and rides, not page views and CTR.

Moreover, rich snippets that augment search results do in fact generally correlate with higher CTR, a fact that is more easily measurable now due to the addition of new “Rich results” and “Q&A rich results” filters in Google Search Console. According to Merkle’s analysis, most websites employing structured markup will see greater exposure in search than sites that do not use structured markup. Furthermore, higher ratings in rich snippets tend to improve CTR significantly, while other elements can improve or hurt CTR depending on the context. Price, for example, may improve CTR for cheaper items but decrease CTR for more expensive items.

Prin noted that although Schema.org defines 600 or so schema types, Google supposedly only indexes 30 of them. He quoted a developer who, in a Twitter post, expressed a common attitude: “We don’t have the dev resources to do anything that isn’t supported by Google.”

But Prin suggested that such thinking may be short-sighted. Google’s Gary Illyes, for instance, has said that Schema, in general, helps Google “understand the content on the page,” and Google’s John Mueller has said that “Schema can help us extract entities better.” In short, Google may be making broader use of Schema types than what is apparent through evidence like rich snippets.

Regardless of the meaning behind these vague clues, I would say that broader use of Schema types could also help to future-proof your content since Google’s appetite for structured data only appears to be expanding.

Indeed, as an example of the expansion of use cases and applications for semantic markup, Prin mentioned Google Assistant voice search results, noting that we know featured snippets are being used in voice results today, though there isn’t any data on how frequently it’s happening. He also mentioned that “speakable” markup is now being beta tested by Google; such markup would indicate to a voice interface that content is intended to be spoken aloud.
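As a rough illustration of what “speakable” markup can look like, the sketch below builds a JSON-LD object using the speakable property and SpeakableSpecification type from the Schema.org vocabulary. The page name, URL, CSS selectors, and the helper name speakable_page are hypothetical, and the session did not detail how Google's beta consumes this markup, so treat it as a sketch under those assumptions.

```python
import json


def speakable_page(name, url, css_selectors):
    """Describe which parts of a page are suited to being read aloud."""
    return {
        "@context": "https://schema.org",
        "@type": "WebPage",
        "name": name,
        "url": url,
        "speakable": {
            "@type": "SpeakableSpecification",
            # Selectors pointing at the sections meant for voice output.
            "cssSelector": list(css_selectors),
        },
    }


# Hypothetical page and selectors, for illustration only.
page = speakable_page(
    "Example article",
    "https://www.example.com/article",
    [".headline", ".summary"],
)
print(json.dumps(page, indent=2))
```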

As an aside, I’d be curious whether “speakable” markup opens up the possibility for publishing variants of textual content, such as a more detailed version intended for reading and a simplified, more concise version designed for voice. If it were to become popular to create such variants, an interesting side effect would be that many complex web pages would include summaries of their content, which could be useful for other purposes.

Finally, Prin offered some ideas for how semantic markup might be used for broader analysis of business impacts. Because semantic markup gives you the secondary benefit of organizing your content according to meaningful tags such as price or publish date, you can group content by these tags and correlate with other data points, in order, for instance, to determine the point at which views drop off for older content, or to examine correlations between price and CTR.
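The kind of grouping Prin described might be sketched as follows: pages are bucketed by the price found in their markup, and clicks and impressions are summed per bucket to give a CTR for each price band. The field names, the band width, and the figures are all invented for illustration; they are not Merkle's data or method.

```python
from collections import defaultdict


def ctr_by_price_band(pages, band_width=50):
    """Group pages into price bands and return the CTR for each band.

    Each page is a dict with "price", "clicks" and "impressions" keys.
    The result maps the lower bound of each band to clicks / impressions.
    """
    totals = defaultdict(lambda: [0, 0])  # band start -> [clicks, impressions]
    for page in pages:
        band = int(page["price"] // band_width) * band_width
        totals[band][0] += page["clicks"]
        totals[band][1] += page["impressions"]
    return {
        band: clicks / impressions
        for band, (clicks, impressions) in sorted(totals.items())
        if impressions  # skip bands with no impressions
    }


# Made-up numbers, purely to show the shape of the analysis.
pages = [
    {"price": 20, "clicks": 30, "impressions": 1000},
    {"price": 45, "clicks": 20, "impressions": 1000},
    {"price": 120, "clicks": 10, "impressions": 1000},
]
print(ctr_by_price_band(pages))
```

The same grouping could be applied to publish date instead of price to see where views begin to drop off for older content.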

The clear takeaway from the session is that structured data will continue to grow in importance as Bing and Google work to expand its practical use. According to both presenters, it pays to be both aggressive and creative in your use of structured markup, moving a little beyond the boundaries of typical use cases and coding with an eye toward emergent technology, particularly voice search. In all of this, the views of the presenters conformed encouragingly with the notion that a Semantically Enhanced Web is gradually coming into being.


Opinions expressed in this article are those of the guest author and not necessarily Search Engine Land. Staff authors are listed here.


About the author

Damian Rollison
Contributor
Damian Rollison is VP of Product Strategy at Brandify, a leading local search solution provider specializing in multilocation brands. Damian has more than ten years of experience in SEO, reputation management, and listings management, having previously served as product lead at UBL and Moon Valley Software. Damian writes a regular column at Street Fight covering various topics in local.
