Google’s Hunger For Structured Markup

Google is keen for structured markup — to put it mildly.

In the not-too-distant past, I wrote about Google’s Data Highlighter for event data, a tool which allows webmasters to indicate structured data for events without having to actually mark up the site’s HTML code. It has the charming feature that the resulting extracted data is visible to the webmaster only in the Structured Data section of Webmaster Tools; and, of course, the data is available to Google itself.

As there is no actual structured markup ever placed on the page (i.e., no schema.org, microdata or any other markup), the information is extracted via human-guided machine learning and possibly other techniques. This extracted/consumed information resides internally in Google and is not available to any other search or social engine for consumption. Thus, my question at the time was, “Is Google Hijacking Structured Markup?” As you read on, you will certainly realize the answer is affirmative.

Google Structured Data Markup Helper

While perusing the new types of structured data supported by the Data Highlighter, I came across a far more interesting tool — a means Google is giving webmasters to add structured markup to their sites.

Google Structured Data Markup Helper

The Google Structured Data Markup Helper is actually a pretty cool tool. It lets you enter a URL and then highlight the on-page elements for which you would like to generate structured data markup. The tool maps each highlighted element to the appropriate schema.org vocabulary and guides you on where that element fits in the schema.org ontology. To test it out and illustrate how it works, I used this product page as an example.

To start, I selected “Products” from the options above, entered the product page URL, and clicked “Start Tagging.” This brought up the screen below: the schema for “products” and its associated data items appeared on the right-hand side of the screen, and the webpage itself appeared on the left.

Google Structured Markup Helper Product Ex

In this environment, you can highlight any page element — when you do, a drop-down menu appears from which you can select an identifier (Product Name, Product Image, Price, Brand Name, etc.) from among available schema.org markup. Once selected, this information populates within the “My Data Items” pane on the right. On my example page, you can see in the screenshot below that I indicated the brand name (“Rolodex”) and the price (“$21.90”).

Brand Added Price Date Verified

(Of particular note for you data quality folks out there, and I presume that includes anyone involved in Google Shopping at a minimum: the date on which the price was verified is also recorded above.)

After tagging all the page elements you’d like to annotate, click on the “Create HTML” button in the upper right-hand corner. This generates a new version of the source code for the page, with the added microdata markup (highlighted for your convenience). All you need to do is add the highlighted HTML markup to your page as shown. Very useful and elegant — and, unlike with the Data Highlighter tool, you actually do get the schema.org markup physically on your page (and thus viewable by other search engines, your Chrome plugins, etc.).

Screen HTML Source Microdata Markup
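To make the idea of microdata "physically on your page" concrete, here is a minimal Python sketch that renders the kind of annotated HTML such a tool might produce for the example product. It is not the Helper's actual output: the function name `product_microdata`, the placeholder product name, the choice of properties and the assumption that "$" means USD are all mine. Only the brand ("Rolodex") and the price (21.90) come from the example above.

```python
from html import escape


def product_microdata(name: str, brand: str, price: str, currency: str) -> str:
    """Render a schema.org Product as an HTML fragment annotated with microdata."""
    return (
        # itemscope/itemtype declare that this element describes a Product.
        '<div itemscope itemtype="http://schema.org/Product">\n'
        f'  <h1 itemprop="name">{escape(name)}</h1>\n'
        f'  <span itemprop="brand">{escape(brand)}</span>\n'
        # The price lives on a nested Offer item attached via the "offers" property.
        '  <div itemprop="offers" itemscope itemtype="http://schema.org/Offer">\n'
        f'    <meta itemprop="priceCurrency" content="{escape(currency)}">\n'
        f'    <span itemprop="price">{escape(price)}</span>\n'
        '  </div>\n'
        '</div>'
    )


# "Example Desk Organizer" is a hypothetical name; USD is an assumed currency.
snippet = product_microdata("Example Desk Organizer", "Rolodex", "21.90", "USD")
print(snippet)
```

The point of the sketch is that the `itemprop` attributes sit directly in the HTML that every crawler receives, which is exactly what the Data Highlighter does not give you.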

Another point worth observing is that Google gives you a choice of two formats. “Microdata” is selected by default, and “JSON-LD” is provided as an alternative option.

JsonLD microdata supported

I was pleasantly surprised to see this, as I find JSON-LD a far more elegant solution (see the JSON-LD code displayed below).

Structured Data as JsonLD markup
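For comparison with microdata, the following Python sketch builds a JSON-LD description of the same example product and wraps it in the script element that carries it on a page. Again, this is an illustration under assumptions rather than the Helper's real output: `product_jsonld`, the placeholder product name, the use of the schema.org `Brand` type and the USD currency are my own choices.

```python
import json


def product_jsonld(name: str, brand: str, price: str, currency: str) -> str:
    """Return a <script> element containing a schema.org Product as JSON-LD."""
    data = {
        "@context": "http://schema.org",
        "@type": "Product",
        "name": name,
        "brand": {"@type": "Brand", "name": brand},
        "offers": {"@type": "Offer", "price": price, "priceCurrency": currency},
    }
    payload = json.dumps(data, indent=2)
    # Escape "</" so a value can never close the script element early;
    # "\/" is a valid JSON escape for "/", so the data is unchanged.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# "Example Desk Organizer" is a hypothetical name; USD is an assumed currency.
print(product_jsonld("Example Desk Organizer", "Rolodex", "21.90", "USD"))
```

Part of what makes this format appealing is visible here: the data is one self-contained block that can be generated from a plain dictionary, without threading attributes through the page's visible HTML.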

(For the record, Google does state that it prefers microdata for web content.)

As a final note, the Markup Helper supports a range of schema.org markup, but not all of schema.org’s data types. The types supported can be seen in the figure below, and more information can be found here.

Data Types Supported by Markup Helper

Changes to the Google Data Highlighter

The Google Data Highlighter, found under the “Optimization” section of Google Webmaster Tools, was recently extended to support more than just events. As you can see, the new data types supported by the Data Highlighter are identical to those supported by the Structured Markup Helper.

Data Types Supported by Google Data Highlighter

There are clearly many other tools on the market that enable webmasters/users to generate structured markup for their webpages. However, the fact that Google has released two different tools to do this makes it clear that it intends to lead the charge (or at least be at the forefront) of the proliferation of structured data markup on the Web.

Google is definitely “hijacking structured markup” with the Data Highlighter, since the information it extracts is not consumable by the standard semantic Web community tools. At the same time, its official support of JSON-LD within the Structured Data Markup Helper amounts to an announcement of formal support to the semantic Web community. Doing both at once, expanding the Data Highlighter while embracing JSON-LD, is an interesting juxtaposition of events.
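The practical difference is that markup placed on the page can be read by anyone who fetches the page, whereas Data Highlighter annotations never leave Google. As a rough illustration, the Python sketch below pulls JSON-LD blocks out of an HTML document using only the standard library. The class name `JsonLdExtractor`, the helper `extract_jsonld` and the sample page are hypothetical, and a production tool would need error handling for malformed JSON.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of every application/ld+json script block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Script content may arrive in several chunks, so accumulate it.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.items.append(json.loads("".join(self._buffer)))
            self._in_jsonld = False


def extract_jsonld(html: str) -> list:
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.items


# A made-up page carrying the brand and price from the example product.
page = """<html><head>
<script type="application/ld+json">
{"@context": "http://schema.org", "@type": "Product",
 "brand": {"@type": "Brand", "name": "Rolodex"},
 "offers": {"@type": "Offer", "price": "21.90", "priceCurrency": "USD"}}
</script>
</head><body><p>Product page</p></body></html>"""

items = extract_jsonld(page)
print(items[0]["brand"]["name"], items[0]["offers"]["price"])
```

Nothing comparable can be written against Data Highlighter annotations, because there is nothing in the page source to parse.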

Key Takeaways

  • Be sure to place semantic markup on your pages — search engines will continue to leverage this information to enhance SERPs, presumably in ways that searchers will find useful.
  • Try to keep current with the latest supported microdata formats and schema.org markup (as well as other vocabularies supported by Google, such as GoodRelations).
  • There are many tools on the market to generate static annotations using schema.org and microdata.
  • Be on the lookout for commercially available tools that can dynamically interpret HTML pages in real time and apply relevant semantic markup.

Opinions expressed in this article are those of the guest author and not necessarily Search Engine Land. Staff authors are listed here.


About the author

Barbara Starr
Contributor
Barbara Starr of SemanticFuse is a semantic strategist and software engineer, providing semantic SEO and other related consulting services. Starr is a technology expert and software designer specifically in the semantic search and semantic Web arenas. She worked as Principal Investigator for SAIC on the ARDA AQUAINT program, which was the genesis for Watson at IBM. She also worked on the DARPA HPKB program, which was one of the precursors to the Semantic Web. She is the founder of the Semantic Web Meetup in San Diego, CA, as well as several other meetup groups. She is a governing board member of the Semantic Computing Consortium and is industry chair for IEEE International Conference on Semantic Computing.
