Future SEO: Linked Open Data (LOD)

As mentioned in my column on string entity optimization, the use of structured data allows search engines like Google to understand your page content so they can display better search results, or answers, to user queries.

This month, I’ll focus on Linked Open Data (LOD), which will allow you to publish structured data so it can be interlinked to establish relationships. This is important, as the relationship between words allows a clear understanding of site content by search bots.

In my column about understanding entity search, I explained how semantic search uses an ontology (or language) like microdata, RDFa, etc., to break down a sentence into its subject, predicate and object to show the relationship between the words in your content.

Linked Open Data builds on standard Web technologies such as HTTP, RDF and URLs, extending them so they can be read automatically by computers. That’s why it’s important for SEOs to understand and use LOD when applying structured data to content: it makes that content easier for machines to read.

Sentences More Important Than Keywords

LOD is used to leverage “sentences” in the digital realm as we do in everyday life. Optimizing for semantic search using LOD is about using a digital rendition of natural language sentence structure as the basis for describing things (content). SEOs need to look toward the use of sentences rather than keywords in order to enhance content published on the Web or on Intranets.

It appears “future SEO” will require a more technical background. Most SEOs, myself included, will have to collaborate with the semantic Web community to iron out the details. This isn’t some new optimization tactic for SEOs to cut and paste into their client pages; it’s the very fabric of the Web, and it will require your time, energy, study and perseverance to work through it.

To explain what Linked Data consists of in simple terms, Tim Berners-Lee has defined the following LOD principles.

LOD Principles

In his Design Issues: Linked Data, Berners-Lee provides four principles of linked data (paraphrased below):

  1. Use URIs (Uniform Resource Identifiers) to indicate things
  2. Use HTTP URIs so things can be referred to and found by people or software on behalf of people
  3. When looking up a URI (thing), provide useful information leveraging standards such as RDF (Resource Description Framework) or SPARQL (an RDF query language)
  4. Include links to other related things (URIs) when publishing data on the Web so they can discover other things
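As a rough illustration of principles 2 and 3, software can look up an HTTP URI and ask for an RDF representation through the HTTP Accept header. The sketch below only builds such a request with Python's standard library and does not send it. The helper name rdf_request is hypothetical, the DBpedia URI is the one used in the interview example further down, and whether any particular server honors the header is an assumption.

```python
from urllib.request import Request

def rdf_request(uri: str, media_type: str = "text/turtle") -> Request:
    """Build (but do not send) an HTTP request that asks for RDF.

    "text/turtle" is the media type of the Turtle RDF syntax; a server
    is free to ignore it and return HTML instead.
    """
    return Request(uri, headers={"Accept": media_type})

# The URI denotes a thing (the city of Paris), not just a page about it.
req = rdf_request("http://dbpedia.org/resource/Paris")
print(req.full_url, req.get_header("Accept"))
```

Passing the request to urllib.request.urlopen would perform the actual lookup.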

To help explain more about what LOD is and how you can use it, I’d like to share a recent interview with Kingsley Idehen, founder & CEO of OpenLink Software. Kingsley is an industry-acclaimed technology innovator and provider of technology that exploits LOD across the enterprise and World Wide Web.

What Is Linked Open Data (LOD)?

Paul: Kingsley, can you give us an idea of what LOD is?

Kingsley: Linked Open Data is structured data representation enhanced through the use of HTTP URIs (links). Basically, it’s about entity relationship — model-based structured data representation where entities, attributes, and attribute values are denoted (“referred to”) by links.

[Figure: hash-based HTTP URI denotation illustrated]

HTTP URIs are implicitly open in that translating what they denote is a function of the HTTP protocol as opposed to proprietary protocols scoped to a specific application or platform.

Can you give us an example?

The following statement:

Paris is the capital of France.

Expresses a relationship represented using natural language notation whereby all participants are denoted literally using words:

“Paris” “capital” “France”

And each plays a specific role, i.e., “Paris” is the Subject, “capital,” the Predicate and “France,” the Object.

Courtesy of Linked Data, the statement above could be enhanced by the use of reference (as opposed to literal) identifiers to denote the entities in the roles of: subject, predicate and object.

<#Paris> <#capital> <#France>

If I copy the statement above into a document and then make the document available to users on an HTTP network, the document automatically demonstrates Linked Data: my browser presents a collection of links that enable me to explore the entity relationship represented by the link-enhanced statement. Semantically, my single-statement document implies:

 

<> <#type> <#Document> .

<> <#mentions> <#Paris> .

<> <#mentions> <#capital> .

<> <#mentions> <#France> .

<#Paris> <#capital> <#France> .

 

Note: “<>” is simply shorthand that implies the HTTP URL of the document is to be used as the HTTP URI that denotes the subject in the statement above. Basically, you have a description of a document that includes descriptions of other things. No different to this interview, so to speak.
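To make the shorthand concrete, here is a minimal Python sketch of how relative references such as <> and <#Paris> resolve against the URL of the document that contains them. The document URL http://example.org/paris.html is a made-up placeholder, and the helper name resolve is hypothetical.

```python
from urllib.parse import urljoin

# Hypothetical URL at which the single-statement document is published.
DOC_URL = "http://example.org/paris.html"

def resolve(ref: str, base: str = DOC_URL) -> str:
    """Resolve a relative reference ('' or '#Paris') against the document URL."""
    return urljoin(base, ref)

# The statements from the interview; '' stands for the "<>" shorthand.
statements = [
    ("", "#type", "#Document"),
    ("", "#mentions", "#Paris"),
    ("", "#mentions", "#France"),
    ("#Paris", "#capital", "#France"),
]

# Every subject, predicate and object becomes a full HTTP URI.
absolute = [tuple(resolve(term) for term in stmt) for stmt in statements]

for subject, predicate, obj in absolute:
    print(subject, predicate, obj)
```

Once resolved, each term is a link a browser or crawler can follow, which is what makes the statement navigable.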

The use of the phrases HTTP URI and HTTP URL can be confusing, so it’s best to look at how they are applied to entity denotation as follows:

  • HTTP URIs denote (“refer to” or name) anything
  • HTTP URLs (a kind of HTTP URI) denote Web Documents
  • WebIDs (a kind of HTTP URI) denote Agents (People, Organizations, Software, Machines, and anything else capable of mechanized operation)
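A small sketch of this distinction for hash-based identifiers: the full HTTP URI names the thing, while the part before the "#" is the URL of the Web document that describes it (HTTP clients drop the fragment before sending a request). The example URI is a placeholder and document_url is a hypothetical helper name.

```python
from urllib.parse import urldefrag

def document_url(uri: str) -> str:
    """Return the URL of the document an HTTP client would actually fetch."""
    url, _fragment = urldefrag(uri)
    return url

# Hypothetical hash-based URI that denotes the city of Paris.
paris = "http://example.org/paris.html#Paris"

print(paris)                # the URI: names the thing
print(document_url(paris))  # the URL: locates the describing document
```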

What Is The Linked Open Data (LOD) Cloud?

I’ve heard the LOD Cloud is a massive big-data collective comprised of datasets from a variety of domains such as: general knowledge (Wikipedia), Life Sciences (Bio2RDF), Media (BBC), Government (Data.Gov and Data.Gov.UK) and many others. Can you explain the LOD Cloud in a little more detail for us?

This massive collection of data is an enclave on the Web where all the structured data, in the published datasets, is represented and then published in line with Linked Data principles, i.e., HTTP URIs are used to denote things, because doing so makes the structured data webby (or web-like). In a nutshell, data becomes as navigable and discoverable as anything else on an HTTP network (e.g., the World Wide Web).

Using my earlier example, I can leverage the massive LOD cloud as a powerful source of identifiers that denote a broad range of things. For instance, I can cross-reference entity identifiers in my basic examples with identifiers from the LOD Cloud, as follows:

 

<> <#type> <#Document> .

<> <#mentions> <#Paris> .

<> <#mentions> <#capital> .

<> <#mentions> <#France> .

<#Paris> <#capital> <#France> .

<#Paris> <#sameAs> <http://dbpedia.org/resource/Paris> .

<#France> <#sameAs> <http://dbpedia.org/resource/France> .

 

For example, placing the statements above in a document published to an HTTP network expands on the basic demonstration of what Linked Open Data accords. As you can see, my link traversal is no longer confined to my document; I’ve made a reference to data within DBpedia, which, as a major junction box in the LOD Cloud, could send me (or agents) anywhere.
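The effect of those two cross-references can be sketched in Python by asking which links lead outside the document's own site. The document URL is a placeholder and external_links is a hypothetical helper. Note that the <#sameAs> term here is local to the example document; published datasets commonly use the standard owl:sameAs property for this purpose.

```python
from urllib.parse import urlsplit

# Hypothetical URL of the example document.
DOC_URL = "http://example.org/paris.html"

# The interview's statements with local terms resolved against DOC_URL.
triples = [
    (DOC_URL + "#Paris", DOC_URL + "#capital", DOC_URL + "#France"),
    (DOC_URL + "#Paris", DOC_URL + "#sameAs", "http://dbpedia.org/resource/Paris"),
    (DOC_URL + "#France", DOC_URL + "#sameAs", "http://dbpedia.org/resource/France"),
]

def external_links(statements, doc_url):
    """Return object URIs hosted somewhere other than the document's own site."""
    home = urlsplit(doc_url).netloc
    return sorted({o for _s, _p, o in statements if urlsplit(o).netloc != home})

print(external_links(triples, DOC_URL))
```

Each URI in the result is an exit from the document into the wider LOD Cloud, which a person or software agent can follow to discover more data.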

What’s The Difference Between Linked Data And Linked Open Data?

Are Linked Data and Linked Open Data the same thing?

Not really. The linkage comes from the structure of an entity model based statement (a kind of sentence). The openness comes from the use of a standard for entity denotation in the form of HTTP URIs. Do note, it is possible to make entity relationship model statement collections that provide a structured representation of data using many kinds of identifiers; the magic of HTTP URIs as denotation mechanisms lies in the underlying openness of URIs and the HTTP protocol.

You can have Linked Data that isn’t “Open,” through the use of proprietary identifiers for entity denotation. In short, this is how we’ve all worked with computer programs for years, prior to the emergence of URIs and the HTTP protocol. Even RDF (which mandates the use of URIs and is often conflated with Linked Data) can be used to produce Linked Data that isn’t actually “Linked Open Data.”

The diagram that follows goes a long way toward dispelling some of the confusion that swirls around Linked Data and RDF by reminding everyone of the *fact* that Linked Data was at the very core of the Web’s original design. Recently, I tweaked Tim Berners-Lee’s original proposal document by using HTTP URIs, as opposed to strings, to denote the nodes (subjects or objects) and connectors (predicates) in the network diagram (or graph) that depicts his original World Wide Web proposal.

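[Figure: network diagram from Tim Berners-Lee’s original World Wide Web proposal at CERN, with nodes and connectors denoted by HTTP URIs]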

How Does LOD Benefit A Publisher (E-Commerce Provider)?

With Search Engines in mind and using e-commerce as an example, can you explain the benefits to us?

It increases the Serendipitous Discovery Quotient (SDQ) of content. By that I mean: it increases the degree to which content is discovered in a manner that’s “pleasantly surprising” to users with regards to relevance.

What is Serendipitous Discovery Quotient (SDQ)?

SDQ is a metric for understanding the effects of enhancing structured data representation via HTTP URIs. Golliher wrote a good article a while back (in reaction to his first encounter with the acronym) titled, Serendipitous Discovery Quotient (SDQ): The Future of SEO? Or an Abstract Concept?

IQ is a metric associated with human intelligence. SDQ is a metric associated with Web intelligence.

How can e-commerce benefit?

E-commerce vendors can focus on what actually comes naturally to them, i.e., producing fine-grained descriptions of their products and services, knowing that description clarity is ultimately always the critical factor for discoverability that leads to customer growth and retention.

This fundamentally implies that the description of entities such as offers, products, pricing, availability, and opening and closing hours becomes the focal point of Web content strategy, much more so than site aesthetics and old-school keyword-based SEO hacks.

What About Schema.org Types?

Do Schema.org semantic markup, entities and LOD relate to each other in some way?

Very much so! In schema.org, you have a powerful bridge for publishing structured data that simplifies integration with the LOD cloud. From the LOD cloud side of things, you already have schema.org cross-references in datasets such as DBpedia and many others. It’s all happened in a very natural way, rather than through any kind of brute force.

Today, many online retailers are already publishing structured data based on terms from Schema.org, and in doing so they are enhancing discoverability across three critical frontiers:

  • Search Engines
  • Social Media
  • LOD Cloud
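For a sense of what such structured data can look like, the sketch below builds a product description from Schema.org terms and serializes it as JSON-LD, one of the syntaxes Schema.org supports alongside the microdata and RDFa mentioned earlier. Product, Offer, price, priceCurrency and availability are Schema.org terms; the product name, price and example.com identifier are invented for illustration.

```python
import json

# Hypothetical product; every value below is a placeholder.
product = {
    "@context": "https://schema.org",
    "@type": "Product",
    # An HTTP URI that denotes the product itself, not just its page.
    "@id": "http://example.com/products/widget#this",
    "name": "Example Widget",
    "offers": {
        "@type": "Offer",
        "price": "19.99",
        "priceCurrency": "USD",
        "availability": "https://schema.org/InStock",
    },
}

# JSON-LD of this kind is typically embedded in a page inside a
# <script type="application/ld+json"> element.
markup = json.dumps(product, indent=2)
print(markup)
```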

How Are Hashtags & Linked Data Related To SEO?

Barbara Starr talks about the relationship between Hashtags, Linked Data and SEO; can you elaborate a little more on this?

Hashtags solve a problem that’s long challenged HTTP URIs, i.e., the unwieldy aesthetic nature of long URIs. Through the use of hashtags, the Web user community has used folksonomy-oriented patterns to devise a shorthand pattern for HTTP URIs.

Thus, courtesy of *hashtag* adoption across social media service providers, you can perform the act of HTTP URI based denotation through the practice of hash tagging. Just like that! Everyone is annotating the Web in a manner that adds more semantics to the connections between the entities denoted by these tags.
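As a loose illustration of hashtags acting as shorthand for HTTP URIs, the sketch below maps each hashtag in a post to a full URI. TOPIC_BASE is a hypothetical address; each social media service resolves hashtags to its own URLs.

```python
import re

# Matches a '#' followed by letters, digits or underscores.
HASHTAG = re.compile(r"#(\w+)")

# Hypothetical base address for topic pages.
TOPIC_BASE = "http://example.org/topics/"

def expand_hashtags(text: str) -> dict:
    """Map every hashtag found in the text to the HTTP URI it stands for."""
    return {"#" + tag: TOPIC_BASE + tag for tag in HASHTAG.findall(text)}

print(expand_hashtags("Learning #SEO and #LinkedData"))
```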

Action Items & Take Aways

What action items do you recommend for SEOs to be a part of LOD?

Get to understand that LOD isn’t some scary mysterious thing that’s a specialization for a chosen few. Instead, look at how tagging (using hashtags) is altering the way people post or track content across social media. Just click on a hashtag on G+ or Twitter for instance, and you will immediately realize that each hashtag is really a super key that resolves to a “Topic Web” comprised of contextualized links to related posts, images, music and videos, etc.

All you have to do is describe things using simple subject->predicate->object statements or simply tag posts using hashtags.
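One plain-text way to write such statements down is N-Triples, a W3C line-based RDF syntax: one statement per line, each term written as an absolute URI in angle brackets, terminated by a period. The sketch below is a minimal serializer for URI-only statements; the example.org identifiers are placeholders and to_ntriples is a hypothetical helper name.

```python
def to_ntriples(statements):
    """Serialize (subject, predicate, object) URI triples, one per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in statements)

# Hypothetical document URL used to build absolute identifiers.
DOC = "http://example.org/paris.html"

statements = [
    (DOC + "#Paris", DOC + "#capital", DOC + "#France"),
]

print(to_ntriples(statements))
```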

Thanks for sharing your insight with us, Kingsley. As always, change is on the horizon, and understanding semantic markup and linked open data is becoming more and more a best practice for SEOs.

Linked Open Data Advantages

Below are some key reasons why SEO practitioners must take note of the information above on Linked Open Data:

  1. Openness: This means moving away from optimizing for each search engine and its periodic ranking algorithm changes; this is optimization for the Web as a whole.
  2. Cost-effectiveness: The longevity of SEO is based on entity description-oriented documents that are inherently search engine agnostic.
  3. Discoverability: This increases serendipitous discovery by focusing on entity description granularity.

For more elaboration on the LOD Cloud, I may collaborate with Kingsley in a future column. In the meantime, see the LOD resources below for more information.

Semantic SEO Resources

Opinions expressed in the article are those of the guest author and not necessarily Search Engine Land.



About The Author: is Managing Partner at PB Communications LLC. Specializing in SaaS solutions for Enterprise Store Locator/Finders, Semantic/Organic/Local/Mobile and SEO Diagnostic Audits for increasing online and in-store foot traffic.

  • http://websitecash.net/ Scott McKirahan

    This is insane. If I am reading this correctly, you are suggesting that eCommerce websites tag words on their pages? Talk about an absolutely horrible user experience! It never ceases to amaze me how much website owners are being forced to change things because of the completely inadequate search engine algorithms. It should be the other way around, you know!

  • https://plus.google.com/112399767740508618350/about kidehen

    @ScottyMack:disqus — note how disqus automagically auto completed “@ScottyMack” when you made your comments. Likewise, notice how using #SEO on G+, Twitter, or Facebook will get you to all topics associated with the hashtag? These are all examples of the effects of tagging content published on the Web.

    The pattern: “@ScottyMack” is a shorthand for an HTTP URI from disqus that denotes “You” .

    The pattern: #SEO is a shorthand for an HTTP URI that denotes the topic “SEO”.

    This post is simply about the fact that “ScottyMack” (literal denotation of “You”) is no longer optimal when referring to “You” on the Web . The same applies to topics, as in “SEO” being suboptimal relative to #SEO .

    Google refers to this as: Things instead of Strings, but that catchy moniker is misleading because the real issue at hand is the evolution of denotation (naming) away from String Identifiers to Reference Identifiers, such as HTTP URIs.

    “@” and “#” are just shorthands for HTTP URIs that denote agents and topics respectively :-)

  • http://www.alittlebranding.com/ Bob Strassel Jr.

    Paul, great post! So much great information. I have 2 questions: Do you think that companies who implement LOD will have a first mover advantage over competitors? And how do you think that this “scales” for small businesses who may not have the resources to implement? Thanks again!

  • Addam Hassan

    This is one of the best, or should I say most exciting, articles I’ve read. Could you recommend any resources that you’ve seen that would help e-commerce sites outside the traditional Google and Schema.org that would provide a good walk through? I recently came across this: http://www.iacquire.com/blog/18-meta-tags-every-webpage-should-have-in-2013/ Does anyone else have any further recommendations?

  • http://www.paulbruemmer.com/ Paul Bruemmer

    Bob, it is logical those participating in LOD cloud will be better poised and prepared for the growth when it happens. Existing and new technologies will determine new ways of exploiting LOD cloud to the benefit of everyone visiting the Web. Intuition tells me this will scale to connect and empower all SMBs as a group e.g., united we stand, divided we fall. Semantic strategists and technologists will figure out ways to leverage LOD and SMBs via existing and new devices. Have fun with it!

  • emekaokoye

    Excellent post, Paul. I suggest you update your 4th paragraph:
    ” That’s why it’s important for SEOs to understand and use LOD when applying structured data to content — to make it easier for machines to read that content. ”
    to something like
    ” That’s why it’s important for SEOs to understand and use LOD when applying structured data to content — to make it easier for machines to read that content as well as for humans to discover and understand it too. ”
    Remember the SEOs main objective is to improve discoverability by both machines and humans. Machine-only comprehension of content does not sound right as a major factor for SEOs in adopting the Semantic Web.
    Thoughts?

  • http://www.alittlebranding.com/ Bob Strassel Jr.

    Thanks Paul, let’s hope that happens. Is there anyone out there working with SMBs specifically to enable this now? Or do you think there is time and we are in the early adoption phase?

  • http://www.metapilot.com/ Metapilot

    I’m excited by the granularity in product descriptions and the richness of meanings LOD enables. Search engines today are so clumsy when it comes to distinguishing between similar objects (for example, intricate widgets that are identical except for a single characteristic). When a site contains a lot of those, it’s likely to get lumped into the spam bucket as “duplicate” or “thin” content. Of course, that has as much to do with the language we (marketers) use as it does with the search engines’ ability to draw distinctions from it.

    I might assert that your list of three linked open data advantages could be construed as a bit #naive, however. As far as “openness” goes, future “search engines” will still have to employ an algorithm to prioritize a list of search results and that algorithm will always have biases of some sort–leaving open opportunities for exploitation. And even though rudimentary, the primary descriptive tools of today’s SEO–words–are certainly themselves search engine agnostic. I don’t foresee structured content eliminating all opportunities for the marketer to figure out how to use it in ways that will give them an advantage over a competitor in algorithmic search results. I think we’ll also see that greater granularity will beget greater competition, making discoverability in the LOD world not that much different than it is in today’s world.

    Thanks for your article. It makes me believe that there is still a lot to look forward to in the world of search.

  • Vipin Kumar

    I have to say this, great information there @pbruemmer:disqus.

    It’s amazing how the semantic web can help build an (almost) perfect world. Also, I love the fact that LOD will enhance the quality of web search by giving searchers serendipitous results which may be significant for them but which they didn’t know about. Eventually, what we have is a searcher with complete information, which is a GREAT thing, considering the fact that “half information is dangerous” :-)

    I am sure the semantic web will make people quite fully informed, which is great for everyone on this planet. I guess I am not going overboard, but just trying to think of the implications of the semantic web. It looks great from here!!

  • Jesse Greer

    Fantastic article, this should be THE primer for everyone interested in linked data. Bravo.

 
