Are PDFs Optimal For SEO? The Pros And Cons

I expect that most everyone working in SEO knows that PDFs are indexable by search engines. PDFs can also appear with an authorship rich snippet in Google SERPs. But just because a file format can be indexed doesn’t mean that it’s the ideal approach. Today, I’d like to explore the pros and cons of PDFs from an SEO perspective.

The Pros Of Using PDFs

There are some pros to using PDFs. Besides ease of use, they can help with indexing because these documents contain meta data, links, indexable content and authorship attributes.

1. Easy to Create

PDFs can be very helpful for marketers, especially those with smaller teams or limited resources. They’re easy to create — just save your document from Word, Illustrator, etc., as a PDF. Press releases, case studies, product data sheets and more can quickly be converted to an essentially web-ready format. For those without any HTML programming knowledge, PDFs for certain document types can be a fast way to publish web-based content.

2. Contain Meta Data

PDFs also contain meta data, such as meta keywords and descriptions. You can find and edit the meta information under Properties in the File menu in Adobe Acrobat. While meta data doesn’t have a high impact on SEO anymore, I like to think of the meta description as your opportunity to craft just the right description that will compel a searcher to choose your website in the SERPs, and I’d rather write my own description than have a search engine choose it for me.

3. Contain Links

Like web pages, PDFs can also contain links, and those links can be followed by search engine bots. These links can contain anchor text, as well.

4. Indexable Content

Perhaps the most attractive pro of using PDFs is that the content within the PDF is generally readable and indexable by search engines. However, not all PDFs have readable content. To ensure that the text is readable, it should be created as text, not as an image, making it ideal to create the PDF from the originating program, like Word or Illustrator.

5. Authorship Applied

Also like HTML pages, authorship can be identified and inferred by Google for PDFs. However, as with HTML pages, authorship will only show for the first author listed, so it’s important to be sure that the preferred author is listed first. Also, the site hosting the PDF must be listed as a “contributor” site in that author’s Google+ profile.

The Cons Of Using PDFs

There are a number of drawbacks to using PDFs: they typically lack navigation, and they give you less control over document length, page content, document organization, code editing, structured markup and tracking.

1. Lack of Navigation

One of my greatest concerns about relying too heavily on PDFs for website content is that PDFs often lack site navigation. This means that when a visitor arrives at the PDF, they have no simple way to reach other pages on the site. So if the PDF happens to rank well in organic search and a searcher finds the link and arrives at the PDF, how can that visitor easily access other content on your site?

2. Length of Document

Because it’s so easy to save a document as a PDF file, it’s not common to break up a PDF into multiple, smaller documents. For example, in the case of a whitepaper or report, the PDF could range from a few pages to hundreds of pages. This isn’t really ideal for SEO in some cases because longer documents contain more text and often multiple topics. This means that one PDF document, which will equate to one URL, may contain a lot of content that normally might be broken up into multiple website pages in HTML.

3. Lack of Page Organization/Control

Certainly one of the greatest benefits of using a content management system for a website is page organization and control. PDFs, however, don’t often work within the organizational structures of a CMS as pages, but rather as downloads. So, relying on PDFs as page content isn’t ideal simply from a page organization and control perspective.

4. Lack of Code Editing Capabilities

Certainly one of the benefits of HTML pages is the flexibility that HTML authors have to edit the website code. For instance, images can be optimized for search through tags and other options in HTML, but images cannot be optimized as well in a PDF. This also makes PDFs less than ideal for Section 508 compliance, because adding alternate text to each image within the PDF is not as simple as adding an “alt” attribute in HTML.

5. Can’t Implement Structured Markup

Structured markup and the rich snippets it can generate have been shown through various studies to improve SERP visibility and click-through rate in organic search. But PDFs don’t work the same way that HTML does. Authors cannot apply structured markup to the content because of the way the PDF file type works.

In my estimation, that’s a true disadvantage of PDFs. For instance, what if your PDF contains recipes? You won’t be able to use structured markup around those recipes, therefore excluding those recipes from Google’s recipe view in organic search and preventing those recipes from showing recipe rich snippets.
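To make the contrast concrete, here is a minimal sketch of the kind of recipe markup an HTML page can carry and a PDF cannot. It assumes the schema.org Recipe vocabulary expressed as JSON-LD, which is only one of several structured markup formats; the function name and the sample recipe are hypothetical and not from any particular site.

```python
import json


def recipe_jsonld_script(name, ingredients, steps):
    """Return an HTML <script> tag holding schema.org Recipe markup as JSON-LD."""
    data = {
        "@context": "https://schema.org",
        "@type": "Recipe",
        "name": name,
        "recipeIngredient": list(ingredients),
        "recipeInstructions": [
            {"@type": "HowToStep", "text": step} for step in steps
        ],
    }
    # Escape "</" so recipe text can never close the script tag early.
    body = json.dumps(data, indent=2).replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Hypothetical sample recipe, for illustration only.
snippet = recipe_jsonld_script(
    "Basic Pancakes",
    ["1 cup flour", "1 egg", "1 cup milk"],
    ["Whisk the ingredients together.", "Cook on a hot griddle."],
)
print(snippet)
```

The resulting tag would be placed in the HTML page that presents the recipe. There is no equivalent place to put it inside the PDF itself.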

6. Lack of Tracking Mechanisms

I find the greatest disadvantage of using PDFs to be the lack of tracking mechanisms I can apply to PDF documents. Google Analytics can perform tracking through onclick event tracking for PDF downloads, but other tracking within the PDF is not as simple. Additionally, there may be other tracking mechanisms your site uses, such as a marketing automation system. The tracking code for these systems also would not be able to be added to the PDF.

Unlike with HTML pages, PDFs make it much more difficult to fully understand how a visitor is progressing through your site, which is less than ideal.
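The onclick event tracking mentioned above runs in the browser. A different, server-side way to at least see which PDFs are being requested is to count them in the web server’s access log. The following is a rough sketch of that alternative, not of the Google Analytics approach. It assumes log lines in the common Apache/Nginx “combined” format, and the function name and sample lines are hypothetical.

```python
import re
from collections import Counter

# Matches the request and status fields, e.g. "GET /docs/a.pdf HTTP/1.1" 200
REQUEST_RE = re.compile(r'"GET (\S+) HTTP/[^"]*" (\d{3})')


def count_pdf_requests(log_lines):
    """Count GET requests answered with status 200 whose path ends in .pdf."""
    counts = Counter()
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if not match:
            continue
        url, status = match.groups()
        # Drop any query string so /a.pdf?src=email counts as /a.pdf.
        path = url.split("?", 1)[0]
        # Only count full responses; partial (206) range requests are skipped.
        if status == "200" and path.lower().endswith(".pdf"):
            counts[path] += 1
    return counts


# Hypothetical sample log lines, for illustration only.
SAMPLE_LOG = [
    '203.0.113.5 - - [10/Jan/2014:10:00:00 +0000] "GET /docs/whitepaper.pdf HTTP/1.1" 200 1048576 "-" "Mozilla/5.0"',
    '203.0.113.6 - - [10/Jan/2014:10:01:00 +0000] "GET /docs/whitepaper.pdf?src=email HTTP/1.1" 200 1048576 "-" "Mozilla/5.0"',
    '203.0.113.7 - - [10/Jan/2014:10:02:00 +0000] "GET /index.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
]

pdf_counts = count_pdf_requests(SAMPLE_LOG)
print(pdf_counts.most_common())
```

A count like this only shows that a PDF was requested. It still says nothing about what the visitor did inside the document, which is the limitation described above.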

Conclusion

In the end, PDFs are clearly not the best option for SEO. This doesn’t mean they are bad for SEO, but they simply don’t put control of SEO in the hands of the webmaster. To realize the greatest benefits from SEO, I recommend moving content from PDF to HTML site pages where applicable, giving webmasters greater control and flexibility and the best opportunity for visibility and tracking.

Opinions expressed in the article are those of the guest author and not necessarily Search Engine Land.


About The Author: Janet Driscoll Miller is the President and CEO of Marketing Mojo. She regularly blogs on a variety of search engine marketing topics, often focusing on technical solutions. You can find her on Twitter @janetdmiller.

  • Ben Heligman

    Thanks for addressing this, Janet. To be honest, I thought the content of PDFs couldn’t be read by search engines at all. How can I tell whether my PDF is readable or not?

  • Tim Ruof

    I would suggest using Word or InDesign instead of Illustrator. AI really shouldn’t be used for text based PDF creation in my opinion.

  • http://frontandsocial.com/ swinterroth

    I think the biggest con to using PDF in any type of web application is the fact that some browsers (Mainly Google Chrome) don’t always handle the PDF and “crash”. At least, this is my experience.

  • http://www.WPBlogTips.com/ Shahzad Saeed

    OK Janet,
    I do have a question. What if I create a web copy and also provide a pdf ebook with exact content of the web copy so that if someone wanted to read offline they could easily download the pdf? Will it be counted as duplicate content as Google can read both the web and pdf?

  • janetdriscollmiller

    Ben,

    I’ve really never had a PDF that wasn’t indexable. Ideally, if you save your document as a PDF from a program like Microsoft Word, it preserves the text format, making that text readable by search engines. However, if you scan a document as a PDF and are not using OCR, it could be that the text in the PDF is like an image, and therefore the text would not be readable by search engine bots.

  • janetdriscollmiller

    Completely agree. I’ve had a terrible time with Chrome and PDFs… especially printing them from Chrome.

  • RightTech

    I think Google will do OCR on image-only PDFs and index the resulting text. Of course, OCR is not 100% accurate and thus it is always better to start with an e-doc and convert direct to PDF.

  • RightTech

    Firefox (at least on XP) is much, much worse on PDFs than Chrome. It crashes all the time, while I’ve never had a crash in Chrome.

    Also, Chrome and Adobe Reader plug-in (for IE or Firefox) will take text that is a URL and make it click-able. Firefox and Safari do not make this text click-able, so you need a PDF creation process that turns these into actual links.

  • http://www.indiabizsource.com/ Anoop Srivastava

    Janet, when we target long-tail keywords in a PDF and share it, it starts ranking without any promotion. I have checked this by uploading to SlideShare, and it helps to get traffic as well as leads. Thanks for your post.

  • http://www.tiptechnews.com/ Rameez Ramzan Ali

    Hello Janet Driscoll Miller,

    Suppose I make a corporate PDF and upload it to different slide-sharing sites. Is this a good or bad approach?

  • http://leavetown.com/ LeaveTown.com Vacations

    Hi, if I have a blog article that I want to spin into a downloadable pdf that resides in the root directory, this would be duplicate content, yes? So should I use a noindex or canonical URLs for the pdf? Great article. Thanks!

  • Rajesh_magar

    That’s cool! I’m a very big fan of PDF documents myself, because they are so portable and always available, whether you are online or offline.

    But can you suggest some great tools for creating graphic-rich PDF documents?

  • Rajesh_magar

    Ben, I agree with Janet’s statements! Yes, PDFs are totally indexable if you generate them as real text content, not as an image.

    Here is a small search query you can use to prove it:

    site:yourdomain.com “some text you have added in pdf” or
    site:yourdomain.com some text you have added in pdf filetype:pdf

    Hope that helps!

  • Rajesh_magar

    Hi Shahzad,

    The solution is pretty simple: make sure that your PDF document is set to nofollow and noindex for search engines. robots.txt is a great way to do so.

    And you’re done!

  • Rajesh_magar

    Hi Rameez,

    There definitely wouldn’t be any problem with that, as long as you take care of the duplicate content problem.

    In fact, reformatting content is a great technique, and a really effective one too.

  • http://www.tiptechnews.com/ Rameez Ramzan Ali

    Yes, you are right. Recently Matt Cutts also said that 25% of the content on the internet is copied, so we need to think about that.

  • Zach Stone

    Great post Janet, 100% agree with everything you stated. One reason I am not a huge fan of pdfs is that unless I am mistaken you can’t redirect them if you have outdated information or you want to take them down for one reason or another. So, if another website picks up the pdf and you have to take it down you can’t redirect them back to your website or another more relevant piece of content. The ease of navigation is also an issue for me. PDFs definitely have benefits and can be used in a variety of different ways, thanks!

 
