What is quality content?
Columnist Patrick Stox takes a comprehensive look at what Google might consider to be "quality content" and adds his own thoughts and tips based on his experience in the SEO industry.
We’ve all heard that content is king and that you need to write high-quality content, or now “10x content,” as coined by Rand Fishkin. Ask SEOs what “quality content” is and you’ll receive a lot of varied and opinionated answers. Quality is subjective, and each person views it differently.
Ask SEOs what Google considers to be quality content, and you will get a lot of blank stares. I know because I like to ask this a lot.
The number one answer I get, sadly, is that content should be x number of words, where x is usually 200, 300, 500, 700, 1,000, 1,500, or 2,000. More content does not mean better content. A simple query about the age of an actor can be fully answered in a sentence and doesn’t require their life story and filmography.
Another answer I receive is that the content should be “relevant.” The problem with this is that low-quality pages can be relevant as well.
Other SEOs I’ve asked have given amazingly detailed answers from patents or ideas from machine learning about word2vec, RankBrain, deep learning, count-based methods and predictive methods.
Is there a right answer?
Google Webmaster Quality Guidelines
Google has quality guidelines here. However, you may notice that there are many guidelines around negative signals but few around positive signals. When reading these, think for a minute about what happens when two, ten or a hundred websites aren’t doing anything bad. How do you determine the quality difference if no one does anything wrong?
- Make pages primarily for users, not for search engines.
- Don’t deceive your users.
- Avoid tricks intended to improve search engine rankings. A good rule of thumb is whether you’d feel comfortable explaining what you’ve done to a website that competes with you, or to a Google employee. Another useful test is to ask, “Does this help my users? Would I do this if search engines didn’t exist?”
- Think about what makes your website unique, valuable or engaging. Make your website stand out from others in your field.
Avoid the following techniques:
- Automatically generated content
- Participating in link schemes
- Creating pages with little or no original content
- Sneaky redirects
- Hidden text or links
- Doorway pages
- Scraped content
- Participating in affiliate programs without adding sufficient value
- Loading pages with irrelevant keywords
- Creating pages with malicious behavior, such as phishing or installing viruses, trojans or other badware
- Abusing rich snippets markup
- Sending automated queries to Google
Follow good practices like these:
- Monitoring your site for hacking and removing hacked content as soon as it appears
- Preventing and removing user-generated spam on your site
Google on how to create valuable content
Then there’s this section from Google’s Webmaster Academy course, which tells you how to “create valuable content.” There are a few good tips here on what to avoid: broken links, wrong information, grammar or spelling mistakes, excessive ads and so on. These are useful tips, but again, they focus on what not to do.
There are some tips on how to make your site useful, credible and engaging; however, when it comes to being more valuable or high-quality, Google basically says, “be more valuable or high-quality.”
As you begin creating content, make sure your website is:
Useful and informative: If you’re launching a site for a restaurant, you can include the location, hours of operation, contact information, menu and a blog to share upcoming events.
More valuable and useful than other sites: If you write about how to train a dog, make sure your article provides more value or a different perspective than the numerous articles on the web on dog training.
Credible: Show your site’s credibility by using original research, citations, links, reviews and testimonials. An author biography or testimonials from real customers can help boost your site’s trustworthiness and reputation.
High-quality: Your site’s content should be unique, specific and high-quality. It should not be mass-produced or outsourced on a large number of other sites. Keep in mind that your content should be created primarily to give visitors a good user experience, not to rank well in search engines.
Engaging: Bring color and life to your site by adding images of your products, your team or yourself. Make sure visitors are not distracted by spelling, stylistic and factual errors. An excessive number of ads can also be distracting for visitors. Engage visitors by interacting with them through regular updates, comment boxes or social media widgets.
Google’s Panda algorithm
Panda algorithmically assessed website quality. The algorithm targeted many signals of low-quality sites but again didn’t provide much in the way of useful information for positive signals.
Google’s Search Quality Rating Guidelines
There are a lot of signals for both high- and low-quality content and websites in Google’s Search Quality Rating Guidelines. The document is worth reading in its entirety multiple times, but I pulled out some of the important parts here:
What makes a High-quality page? A High-quality page may have the following characteristics:
- High level of Expertise, Authoritativeness and Trustworthiness (E-A-T)
- A satisfying amount of high quality MC (Main Content)
- Satisfying website information and/or information about who is responsible for the website, or satisfying customer service information if the page is primarily for shopping or includes financial transactions
- Positive website reputation for a website that is responsible for the MC on the page
They expand further on the concept of E-A-T. This was the part of the guidelines I found the most interesting and relevant in determining quality of content (or a website in general).
6.1 Low Quality Main Content
One of the most important criteria in PQ (Page Quality) rating is the quality of the MC, which is determined by how much time, effort, expertise and talent/skill have gone into the creation of the page and also informs the E-A-T of the page.
Consider this example: Most students have to write papers for high school or college. Many students take shortcuts to save time and effort by doing one or more of the following:
- Buying papers online or getting someone else to write for them
- Making things up
- Writing quickly, with no drafts or editing
- Filling the report with large pictures or other distracting content
- Copying the entire report from an encyclopedia or paraphrasing content by changing words or sentence structure here and there
- Using commonly known facts, for example, “Argentina is a country. People live in Argentina. Argentina has borders.”
- Using a lot of words to communicate only basic ideas or facts, for example, “Pandas eat bamboo. Pandas eat a lot of bamboo. Bamboo is the best food for a Panda bear.”
I found the part about large images amusing. I’m not a fan of hero images unless they are exceptional. Unfortunately, most end up being generic. Some publications make it worse and use generic hero sliders. Remember, there is an algorithm that looks at “above-the-fold” content, and I feel that hero images work completely against it. With most hero images, little to no useful content is visible without scrolling.
In section 7.0, “Lowest Quality Pages,” Google notes that the following types of pages/websites should receive the Lowest quality rating:
- Harmful or malicious pages or websites
- True lack of purpose pages or websites
- Deceptive pages or websites
- Pages or websites which are created to make money with little to no attempt to help users
- Pages with extremely low or lowest-quality MC
- Pages on YMYL websites that are so lacking in website information that it feels untrustworthy
- Hacked, defaced or spammed pages
- Pages or websites created with no expertise or pages which are highly untrustworthy, unreliable, unauthoritative, inaccurate or misleading
- Websites which have extremely negative or malicious reputations
- Violations of the Google Webmaster Quality Guidelines
Speaking more specifically about page content in section 7.4, “Lowest Quality Main Content,” the guidelines note that the following types of Main Content (MC) should be judged as Lowest quality:
- No helpful MC at all or so little MC that the page effectively has no MC
- MC which consists almost entirely of “keyword stuffing”
- Gibberish or meaningless MC
- “Auto-generated” MC, created with little to no time, effort, expertise, manual curation or added value for users
- MC which consists almost entirely of content copied from another source with little time, effort, expertise, manual curation or added value for users.
Finally, in section 7.2, “Lack of Purpose Pages,” Google notes that:
Sometimes it is impossible to figure out the purpose of the page. Such pages serve no real purpose for users. For example, some pages are deliberately created with gibberish or meaningless (nonsense) text. No matter how they are created, true lack of purpose pages should be rated Lowest quality.
I love how these sections are all basically saying that your page needs to have a purpose and be understood. I’ve seen many marketing pages that use so much lingo, jargon or marketing-speak that even people at the company can’t tell you what the page is about. What’s worse is when good content is stripped away to make more of these kinds of pages.
There are also some interesting snippets regarding the different elements and signals of trust that might need to be included based on the type of website. This information is extremely important, and it’s easy to brainstorm the different website elements that a local business would need (such as “about us” or “contact”), compared to an e-commerce store that might need reviews, pricing and so forth.
The point is that you need to understand the questions your customers are asking and provide that information to them.
12.7 Understanding User Intent
It can be helpful to think of queries as having one or more of the following intents.
- Know query, some of which are Know Simple queries
- Do query, some of which are Device Action queries
- Website query, when the user is looking for a specific website or webpage
- Visit-in-person query, some of which are looking for a specific business or organization, some of which are looking for a category of businesses
The above is very similar to the standard “informational, navigational and transactional” system, but I like this better.
Google elaborates on the idea of matching user intent with the purpose of the page elsewhere in the document — section 2.2, “What is the Purpose of a Webpage?” lists the following common page purposes:
- To share information about a topic
- To share personal or social information
- To share pictures, videos or other forms of media
- To express an opinion or point of view
- To entertain
- To sell products or services
- To allow users to post questions for other users to answer
- To allow users to share files or to download software
Boom! Jackpot. Matching the user intent with the purpose of a page and type of content expected is exactly what I’m looking for in trying to determine quality.
This makes sense if you think about it from the standpoint of semantic search. If I’ve got a product page, and the top results for the keyword I’m targeting are all informational in nature, then I obviously need to either create an informational page or add more information to my product page if I even want to compete.
I see this mismatch often when people ask why they’re not ranking for a specific term.
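The mismatch check described above can be sketched in a few lines. This is only an illustration under assumptions: the purpose labels are assigned by hand after looking at the search results, and the function names and the 0.6 threshold are hypothetical choices, not anything Google publishes.

```python
from collections import Counter


def dominant_purpose(result_purposes):
    """Return the most common purpose label and its share of the results."""
    purpose, count = Counter(result_purposes).most_common(1)[0]
    return purpose, count / len(result_purposes)


def intent_mismatch(my_purpose, result_purposes, threshold=0.6):
    """True when one purpose clearly dominates the results and my page differs.

    The threshold is an arbitrary assumption; tune it to taste.
    """
    purpose, share = dominant_purpose(result_purposes)
    return share >= threshold and purpose != my_purpose


# Hypothetical hand-labelled top 10 results for one keyword.
top_results = ["informational"] * 8 + ["product"] * 2

print(dominant_purpose(top_results))
print(intent_mismatch("product", top_results))
```

If the check flags a mismatch, the options are the ones already mentioned: create an informational page or add more information to the product page.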
Google’s guidance on building high-quality websites
Even before the Quality Raters Guidelines, way back in 2011, there was this gem on the Google Webmaster Central Blog that told us the questions Google engineers asked themselves when building the algorithm.
- Would you trust the information presented in this article?
- Is this article written by an expert or enthusiast who knows the topic well, or is it more shallow in nature?
- Does the site have duplicate, overlapping or redundant articles on the same or similar topics with slightly different keyword variations?
- Would you be comfortable giving your credit card information to this site?
- Does this article have spelling, stylistic or factual errors?
- Are the topics driven by genuine interests of readers of the site, or does the site generate content by attempting to guess what might rank well in search engines?
- Does the article provide original content or information, original reporting, original research or original analysis?
- Does the page provide substantial value when compared to other pages in search results?
- How much quality control is done on content?
- Does the article describe both sides of a story?
- Is the site a recognized authority on its topic?
- Is the content mass-produced by or outsourced to a large number of creators or spread across a large network of sites, so that individual pages or sites don’t get as much attention or care?
- Was the article edited well, or does it appear sloppy or hastily produced?
- For a health-related query, would you trust information from this site?
- Would you recognize this site as an authoritative source when mentioned by name?
- Does this article provide a complete or comprehensive description of the topic?
- Does this article contain insightful analysis or interesting information that is beyond the obvious?
- Is this the sort of page you’d want to bookmark, share with a friend or recommend?
- Does this article have an excessive amount of ads that distract from or interfere with the main content?
- Would you expect to see this article in a printed magazine, encyclopedia or book?
- Are the articles short, unsubstantial or otherwise lacking in helpful specifics?
- Are the pages produced with great care and attention to detail vs. less attention to detail?
- Would users complain when they see pages from this site?
Once again, spelling, factual errors and content quality control are mentioned, just like in the Google Search Quality Rating Guidelines. There are also a couple of questions about a site being recognized as an authority on the topic or an authority in general.
Additionally, there are questions that ask whether the author knows the topic well, whether the content is unique and how comprehensively the topic is covered. This matches up perfectly with the E-A-T concept from the Search Quality Rating Guidelines.
Some content quality signals you can control
- Broken links. Crawl your site with a program like Screaming Frog and fix them.
- Wrong information. Do research and find the right sources.
- Grammatical mistakes. You can use a tool like Grammarly or have someone proofread your writing.
- Spelling mistakes. Use spell-check or an editor.
- Reading level. The Hemingway App is good for this. You should be adjusting your reading level based on your target audience and the intent of the query.
- Excessive ads. Just don’t.
- Page load speed. Go read this.
- Website features. The features you should have will change depending on the type of website and the intent of the query.
- Matching the user intent with the purpose of a page and type of content expected. Take a look at the search results to see what is already ranking.
- Authority and comprehensiveness. Keep reading.
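For the reading-level item in the list above, a rough score can also be computed directly. The sketch below uses the Automated Readability Index, a published formula based on characters per word and words per sentence. It is not meant to reproduce what the Hemingway App reports, and the sentence and word splitting here is a deliberately crude assumption.

```python
import re


def automated_readability_index(text):
    """Approximate U.S. grade level using the Automated Readability Index.

    ARI = 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43
    """
    # Crude splitting: sentences end at ., ! or ?; words are runs of letters/digits.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z0-9']+", text)
    if not sentences or not words:
        raise ValueError("text needs at least one sentence and one word")
    chars = sum(len(w.replace("'", "")) for w in words)
    return 4.71 * chars / len(words) + 0.5 * len(words) / len(sentences) - 21.43


simple = "Pandas eat bamboo. Pandas eat a lot of bamboo."
dense = ("Comprehensive documentation establishes authoritative expertise "
         "regarding sophisticated algorithmic evaluation.")

print(automated_readability_index(simple))
print(automated_readability_index(dense))
```

A lower score means easier text. The useful comparison is between your draft and the pages that already rank for the query, since the right level depends on the audience and the intent.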
There are things outside of your control in the short term, but you can play the long game and continue to build your authority over time by consistently creating comprehensive content.
At SMX West, I briefly showed a way of identifying all topics/subtopics in an industry and how to completely cover these based on keyword groupings. I believe that if you’re covering everything that’s being searched for and answering every question that people are asking about a topic, then you have a complete answer, and it will be the best answer for a search engine to return in the results.
How do I determine quality content?
I want to share a little more about my actual process and what I look for on a page, or on a section of the site, as it relates to the content. Besides technical on-page elements, what I’m usually looking for in the content itself is:
- Concepts and entities
- Co-occurrence of keywords/phrases
- Topical completeness
Concepts and entities
We know that Google looks for concepts and entities in the content, so I usually start here. I use Alchemy API for this.
If I enter the page from Google about creating valuable content (https://support.google.com/webmasters/answer/6001093?hl=en), I get back some information on entities such as Search Console, search engines, Google and social media. Concepts returned are for website, Google search, PageRank, web search engine, Bing and Google. Alchemy also returns keyword relevance.
If you run many of the top ranking websites for a search query through Alchemy API, you will find a lot of overlap that indicates useful data. There are likely consistent concepts and entities that you would want to include in the body of your text. Alchemy has a JSON output, and I know a lot of people use Blockspring to pull into Google Sheets.
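Once the entities for each top-ranking page are in hand, finding the overlap is a simple count. The sketch below assumes the entities have already been extracted and reduced to one list of names per page. That input shape, the page labels and the `min_share` cutoff are all hypothetical, and this is not the format Alchemy API returns.

```python
from collections import Counter


def common_entities(entities_by_page, min_share=0.5):
    """Return entities that appear on at least `min_share` of the pages."""
    total = len(entities_by_page)
    counts = Counter()
    for entities in entities_by_page.values():
        # Lowercase and dedupe so each page counts an entity once.
        counts.update({e.lower() for e in entities})
    return sorted(e for e, c in counts.items() if c / total >= min_share)


# Hypothetical, hand-typed entity lists for three ranking pages.
entities_by_page = {
    "page-a": ["Google", "Search Console", "social media"],
    "page-b": ["Google", "Bing", "search console"],
    "page-c": ["Google", "PageRank"],
}

print(common_entities(entities_by_page))
```

Entities that clear the cutoff are candidates to cover in the body of your own text.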
Co-occurrence of keywords and phrases
Ultimate Keyword Hunter provides words or phrases that are used on the pages the most. I normally sort by co-occurrence across websites and find that usually two-, three- and four-keyword phrases are the most useful. I set this to pull data from the top 50 search results.
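The same idea can be approximated by hand. The following is a minimal sketch, not how Ultimate Keyword Hunter works internally: it assumes you already have the visible text of each ranking page, and it counts on how many pages each two-, three- or four-word phrase appears. The sample texts and function names are made up for illustration.

```python
import re
from collections import Counter


def ngrams(text, n):
    """Return the set of n-word phrases found in the text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def phrase_cooccurrence(pages, sizes=(2, 3, 4)):
    """Count, for each phrase, how many pages contain it at least once."""
    counts = Counter()
    for text in pages:
        seen = set()
        for n in sizes:
            seen |= ngrams(text, n)
        counts.update(seen)  # one vote per page, however often it repeats
    return counts


# Hypothetical page texts standing in for the top search results.
pages = [
    "Quality content answers the question. Content marketing needs quality content.",
    "A content strategy starts with quality content.",
    "Good content marketing is a content strategy.",
]

shared = {p: c for p, c in phrase_cooccurrence(pages).items() if c >= 2}
print(sorted(shared.items()))
```

Counting pages rather than raw occurrences keeps one repetitive page from dominating the list. In practice you would also filter out phrases made up mostly of filler words.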
Moz’s new Keyword Explorer has an interesting filter, “related to keywords with similar results pages,” that looks at pages that rank highly for the query entered and looks for other searches that contain the same pages. For example, a quick glance shows me that the pages ranking for “quality content” also rank for different terms around blogs, websites, content marketing and content strategy — all of which I may want to include on my page.
I like to pull all auto-suggested keywords around a topic with Keyword Sh**ter (terrible name, but it’s very useful) and then put the resulting terms back into the AdWords Keyword Planner, which groups them. These groups are the main ideas I want to cover around a topic, whether all on the same page in subsections or on their own pages.
You can see the pivot table I created for auto-suggested terms based on “content quality” here. On a side note, I almost always put the original topic into the Keyword Planner as well, and I will often stem off the original topic into other topics based on the results.
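For a quick look before loading terms into a keyword tool, a crude grouping can be done locally. The sketch below is a stand-in under stated assumptions and not how the AdWords Keyword Planner groups terms: it buckets each suggestion by its first word that is neither part of the seed topic nor a stopword. The stopword list and sample suggestions are hypothetical.

```python
from collections import defaultdict

# Small illustrative stopword list; extend as needed.
STOPWORDS = {"a", "an", "the", "for", "of", "in", "to", "and", "is", "what", "how"}


def group_suggestions(seed, suggestions):
    """Bucket suggestions by their first word outside the seed and stopwords."""
    seed_words = set(seed.lower().split())
    groups = defaultdict(list)
    for phrase in suggestions:
        extras = [w for w in phrase.lower().split()
                  if w not in seed_words and w not in STOPWORDS]
        key = extras[0] if extras else "(seed only)"
        groups[key].append(phrase)
    return dict(groups)


# Hypothetical auto-suggested terms for the seed topic.
suggestions = [
    "content quality checklist",
    "content quality score",
    "what is content quality",
    "content quality score tool",
    "checklist for content quality",
]

groups = group_suggestions("content quality", suggestions)
for key, phrases in groups.items():
    print(key, phrases)
```

Each bucket is a candidate subsection or standalone page, which is the same role the grouped ideas play in the process described above.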
Another tool I like is Answer The Public, which I first heard about from Wil Reynolds. Remember to change the country if you’re not from the UK. The tool scrapes auto-suggested terms and displays them in groups organized by the questions people are asking.
These create the silo of pages I need around each topic to really make sure I’m covering it in-depth, providing answers to all questions being asked and catching people in every part of their journey. I like to think it makes a website the best answer. The more of these you cover, the more expertise and authority you and your website build around a topic.
It all really starts with the query intent. From there, it’s a matter of matching your content and your website to the kinds of information a searcher needs, so that your page is a good result for them.
This is the data I use to determine what I need to include in my content for completeness and relevancy. I like to inject my own expertise and opinions into the content as well — after all, it’s important to know what has been said, but it’s more important to add insights into things that might not have been said.
I know everyone has their own processes and ways of doing things, and I would love to hear from some of you about how you approach quality content. Let me know what you look for, what tools you use or what your process is for determining quality of content.
Some opinions expressed in this article may be those of a guest author and not necessarily Search Engine Land. Staff authors are listed here.