Last month, I shared a case study of a client I’m currently working with on a duplicate content issue. Over the past year, this site had lost significant rankings through Panda updates because other sites were “lifting” its content, often verbatim.

This is a common problem for content producers. You produce great content only to have other sites disregard copyright and repurpose the content on their own sites — without attribution. I see this often in the case of associations and non-profits that may be producing valuable research or information that other sites may want to share. In some cases, the infringement may be unintentional… but in many cases, it isn’t.

So, how can you recover when your content is stolen? First, understand that this isn’t a quick fix or fast process. However, it’s the process that I find works best with Google.

Step 1: Find The Infringements

If you believe you’ve been hit by a Panda update and are seeing dramatic traffic losses on certain pages, I would prioritize looking for duplicate content that corresponds to those specific page losses.

To get started, copy a few lines of content from the page on your website and search for that content as an exact match search in Google by putting it in quotes. If you find pages other than your own that appear with this exact match content, it’s time to find out exactly how much of the content on these pages matches yours. You can also use Copyscape’s Web search to see if it readily identifies other copies of your content on the Web.
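The exact-match search described above can be scripted when you have many pages to check. The following is a minimal sketch, not an official Google API: it only builds the search URL for a quoted phrase. The function name and the optional `exclude_site` parameter (which appends a `-site:` operator to hide your own domain) are assumptions made for illustration.

```python
from typing import Optional
from urllib.parse import quote_plus


def exact_match_search_url(snippet: str, exclude_site: Optional[str] = None) -> str:
    """Build a Google search URL that looks for `snippet` as an exact phrase."""
    # Collapse whitespace and drop embedded double quotes so the whole
    # snippet stays inside a single pair of quotes.
    phrase = " ".join(snippet.replace('"', " ").split())
    query = f'"{phrase}"'
    if exclude_site:
        # Assumption: you want to hide results from your own domain.
        query += f" -site:{exclude_site}"
    return "https://www.google.com/search?q=" + quote_plus(query)
```

Paste the resulting URL into a browser and review the results by hand. A sentence or two from the middle of the page usually works better than a headline, which legitimate sites may quote when linking to you.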

Copy the URL from your page and the URL from the suspect page and paste them into Copyscape’s Content Comparison tool. This tool looks at both pages of content, side-by-side, and indicates the percentage of overlap in the two pages’ content. My rule of thumb is that anything over 50% really does need to be addressed immediately. However, you’d be surprised how often we see 90-100% duplicate content.
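If you want a rough overlap figure for text you have already extracted from both pages, a sketch like the one below can help with triage. It is not Copyscape's algorithm, which is not described here; it is a simple word-shingle comparison, and the function names and the five-word shingle size are assumptions.

```python
import re


def shingles(text: str, size: int = 5) -> set:
    """Return the set of overlapping `size`-word sequences in `text`."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Very short texts become a single shingle (or none if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def overlap_percent(original: str, suspect: str, size: int = 5) -> float:
    """Percentage of the original's shingles that also appear in the suspect."""
    mine = shingles(original, size)
    if not mine:
        return 0.0
    theirs = shingles(suspect, size)
    return 100.0 * len(mine & theirs) / len(mine)
```

The score is measured against your own page, so a site that copies your whole article and adds its own sidebar text still scores close to 100%. Treat the number as a way to sort suspects, then confirm with a side-by-side comparison.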

Step 2: Log The Infringements

As you find duplicate copies, log the information in a spreadsheet, including the percentage of overlap. If you find a site that seems to have copied any of your content nearly verbatim, focus in on these sites and see what else you can find on them. I’ve typically found that sites which duplicate one page of your site don’t stop with just one page. Log all of the pages you can find from the sites that have duplicated your site content.
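Any spreadsheet program works for this log. If you would rather build it from a script, here is a minimal sketch that writes a CSV file you can open in a spreadsheet. The column names are assumptions chosen for illustration; record whatever you expect to need later.

```python
import csv
from dataclasses import asdict, dataclass, fields


@dataclass
class Infringement:
    # Hypothetical columns; adjust to suit your own log.
    found_on: str            # date you found the copy, e.g. "2014-01-15"
    original_url: str        # the page on your site
    infringing_url: str      # the page that copied it
    overlap_percent: float   # overlap reported by your comparison tool
    earliest_archive: str = ""  # earliest archived copy you could find
    notes: str = ""


def write_log(rows, out) -> None:
    """Write the infringements to `out` (an open text file) as CSV."""
    writer = csv.DictWriter(out, fieldnames=[f.name for f in fields(Infringement)])
    writer.writeheader()
    for row in rows:
        writer.writerow(asdict(row))
```

To save the log, open the file with `open("infringements.csv", "w", newline="")` and pass it as `out`.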

Also check the Wayback Machine to see how long the infringing site has been using the copyrighted content. Go back as far as you can to get a full understanding and log of information about this infringement (you may need the information later).
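The Wayback Machine check can be partly automated through the Internet Archive's public availability endpoint. The sketch below only builds the request URL and parses a response; fetching is left to you. It assumes the endpoint returns JSON with an `archived_snapshots.closest` object, which is my understanding of the public documentation, so verify it against a live response. Asking for a very early timestamp returns the snapshot closest to that date, which is an approximation of the earliest capture and not a guarantee.

```python
import json
from urllib.parse import urlencode

WAYBACK_AVAILABLE = "https://archive.org/wayback/available"


def availability_url(page_url: str, timestamp: str = "19960101") -> str:
    """Build a request URL asking for the snapshot closest to `timestamp`."""
    return WAYBACK_AVAILABLE + "?" + urlencode({"url": page_url, "timestamp": timestamp})


def closest_snapshot(response_text: str):
    """Return (timestamp, snapshot_url) from a response body, or None."""
    data = json.loads(response_text)
    # Assumed response shape: {"archived_snapshots": {"closest": {...}}}
    closest = data.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest["timestamp"], closest["url"]
```

Browsing the calendar view on the Wayback Machine site by hand is still the most reliable way to confirm the first date on which the copied text appears.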

Step 3: Reach Out To The Infringing Site Owners

Next, you’ll want to reach out to the infringing site owner(s). I generally start with a friendly email alerting the site owner about the infringement and politely asking the site owner to take the pages down. I also request that the site owner respond to the email, letting me know that the content is down, by a certain date. List out all of the pages on the owner’s site that are in violation and that you would like removed.
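If you are contacting several site owners, you can generate the first email from your log so that every infringing page is listed consistently. This is only a sketch: the wording is illustrative, and the function and parameter names are assumptions, so adapt the text to your own voice.

```python
def takedown_request(owner_name, sender_name, pages, reply_by):
    """Return the plain-text body of a polite first-contact email.

    `pages` is a list of (original_url, infringing_url) pairs.
    """
    page_lines = [
        f"  {number}. {theirs} (copied from {mine})"
        for number, (mine, theirs) in enumerate(pages, start=1)
    ]
    return "\n".join(
        [
            f"Hello {owner_name},",
            "",
            "I noticed that the pages below reproduce copyrighted content from my",
            "website without permission. Please remove them:",
            "",
            *page_lines,
            "",
            f"Please reply by {reply_by} to confirm that the content is down.",
            "",
            "Thank you,",
            sender_name,
        ]
    )
```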

How can you find out who owns a site? If the site itself does not provide contact information, find out who owns the domain through a WHOIS lookup. The site owner may have his/her contact information hidden; if not, you can see the name of the person to contact along with an email and mailing address.
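Most people run the lookup through a web-based tool, but it can also be scripted. The sketch below sends a raw query over the standard WHOIS protocol (TCP port 43) and pulls any email addresses out of the reply. The default server `whois.iana.org` is an assumption; it typically refers you to the registry for the domain's extension, so you may need to repeat the query against the server it names. The function names are hypothetical, and many records will return no personal contact details at all.

```python
import re
import socket

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def whois_query(domain: str, server: str = "whois.iana.org", timeout: float = 10.0) -> str:
    """Send a raw WHOIS query and return the server's text response."""
    with socket.create_connection((server, 43), timeout=timeout) as sock:
        sock.sendall(domain.encode("idna") + b"\r\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection when done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def contact_emails(whois_text: str) -> list:
    """Return the unique email addresses in a WHOIS response, in order."""
    seen = []
    for match in EMAIL_RE.findall(whois_text):
        email = match.lower()
        if email not in seen:
            seen.append(email)
    return seen
```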

Generally, the email is enough to get the owner to take the infringing content down. However, if it’s not, you may want to send a more strongly-worded letter via postal mail. In this case, it may also be helpful to have the services of an attorney who can send a legal letter on your behalf.

Step 4: If the Pages Are Not Removed

After all of your efforts, if the site pages are not removed and you cannot resolve the conflict with the site owner, it’s time to play hardball.

Copyright infringement on the Web is a violation of the Digital Millennium Copyright Act (DMCA). Google and Bing will remove content that is in violation of the DMCA from their search results, but they do request that you attempt to contact the site owner to resolve the issue first. Each search engine provides its own form for filing a takedown request.

The Wrinkle Of Content Syndication

If you’re syndicating content or giving another site permission to copy your content, things can get tricky. If the content is completely duplicated, you (and the site using the content) risk being affected by Panda updates. Sometimes, too, Google misidentifies the original content creator or doesn’t rank the preferred version highest. Google’s advice on syndicated content is:

Syndicate carefully: If you syndicate your content on other sites, Google will always show the version we think is most appropriate for users in each given search, which may or may not be the version you’d prefer. However, it is helpful to ensure that each site on which your content is syndicated includes a link back to your original article. You can also ask those who use your syndicated material to use the noindex meta tag to prevent search engines from indexing their version of the content.
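To confirm that a syndication partner is following this advice, you can check the HTML of their copy for a robots `noindex` meta tag and for a link back to your original article. The sketch below uses only Python's standard library and works on HTML you have already downloaded. The class and function names are assumptions, and the link check is a simple exact match on the `href`, so it will miss links that carry tracking parameters.

```python
from html.parser import HTMLParser


class SyndicationCheck(HTMLParser):
    """Record whether a page has a noindex tag and links to the original."""

    def __init__(self, original_url: str):
        super().__init__()
        self.original_url = original_url.rstrip("/")
        self.has_noindex = False
        self.links_back = False

    def handle_starttag(self, tag, attrs):
        attr = {name: (value or "") for name, value in attrs}
        if tag == "meta" and attr.get("name", "").lower() in ("robots", "googlebot"):
            directives = [d.strip().lower() for d in attr.get("content", "").split(",")]
            if "noindex" in directives:
                self.has_noindex = True
        elif tag == "a" and attr.get("href", "").rstrip("/") == self.original_url:
            self.links_back = True


def check_syndicated_copy(html: str, original_url: str) -> dict:
    """Return which of the two safeguards are present in `html`."""
    parser = SyndicationCheck(original_url)
    parser.feed(html)
    return {"noindex": parser.has_noindex, "links_back": parser.links_back}
```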

Protecting Your Content Long Term

While I know that all of this may seem daunting, ultimately it is the responsibility of the copyright owner to protect his/her copyrighted material. One of the first ways you can do this is by making your copyright very clear: add the copyright symbol (©) and year to each page of your website. I also prefer to show the full range of years, from the first year of publication to the present year, after the copyright symbol so that it is very clear when the copyright was established.
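If your pages are generated from templates, the year range can be produced automatically so it never goes stale. A minimal sketch follows; the function name, the exact wording of the notice and the example owner name are assumptions.

```python
from datetime import date
from typing import Optional


def copyright_notice(owner: str, first_year: int, current_year: Optional[int] = None) -> str:
    """Return a footer notice such as 'Copyright © 2009-2014 Owner'."""
    if current_year is None:
        current_year = date.today().year
    if first_year > current_year:
        raise ValueError("first_year cannot be later than current_year")
    # Show a single year when the site is new, otherwise the full range.
    years = str(first_year) if first_year == current_year else f"{first_year}-{current_year}"
    return f"Copyright © {years} {owner}"
```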

Another tool you can use to catch duplicate pages before they cause you Panda issues is the Copysentry tool from Copyscape. For a small fee, this tool continuously monitors the pages you identify on your site and notifies you when duplicates are found. If you have pages that tend to be more popular or have been duplicated in the past, I’d prioritize these pages for monitoring.

All in all, the process takes time. Time to research the infringing pages, time to document, time to contact site owners, time to report to search engines and time to see recovery (even when infringing pages have been removed). It can be frustrating, but it’s a necessary process to protect your content and keep your organic rankings strong.

Opinions expressed in the article are those of the guest author and not necessarily Search Engine Land.

About The Author: The author is the President and CEO of Marketing Mojo. She regularly blogs on a variety of search engine marketing topics, often focusing on technical solutions. You can find her on Twitter @janetdmiller.

  • Hammad

    What if my content is genuine and ranks high, but lots of people have copied it? Will I be affected by any update/Panda?

  • Dharitri

    No, your site will not be affected by any update/Panda.

  • Hammad

    Yes, that’s what I think too, but I assume all the other sites may face some sort of penalty, right? On the other hand, we may get some backlinks from these sites too. How important are those links? Will they add some value?

  • SuperTJ

    This article is scary weird about this issue. I hope Dharitri is correct.

  • Hammad

    She is correct :) If a site is ranking higher there should not be an issue, or else I could start taking sites down by creating clones :) But it is worth doing an experiment to evaluate the power of those backlinks!

  • Vinay Kumar

    What should you do when no contact information is found on the site and the WHOIS record is protected? Those are mostly the kind of people who steal content.

  • Hammad

    Still, it won’t make much difference if you are the first one in the Google index and your piece is unique!
    However, if you are not on top in the results, then it surely is a problem. In this case, DMCA is the option.

  • The Cash King

    I just did a CopyScape search for a page on my site and found that 7 other sites were using as much as 14% of my content! This is ridiculous. How and why would they be allowed indexing?

  • Hammad

    This is not a very high percentage, so there is not much to worry about. Google does not throw these sites out of the index until they are reported and found guilty.

  • james

    I tried contacting a website (an ecommerce site in direct competition with mine, selling similar things) to get them to take down the content they stole from me. I got a rude response saying, “Why should we?” Because these people are in China, I then tried contacting the Chinese web host and got ignored. I then got Google to remove the web pages through DMCA requests, but there are so many pages they have ripped. Plus, the cheeky swines also publish the same articles on web 2.0 blogs, all automatically, linking back to their rubbish website. Links to my website get removed automatically, but because they use scrapers, text such as “here at *** add my website name here ***” is still present. Their SEO is basically duplicating content stolen from others. Google doesn’t rank their website at all, lol, but recently my website has been taking a slight hit for some reason. Any advice, please?

    thanks

    (Oh yeah, and content from just about everyone else who writes articles that mention the product name is stolen too, not just mine.)

  • Lucille Ossai

    Thanks for sharing!

    An unscrupulous individual copied 18 of my articles, word for word, including images etc., and posted them under his name on his blog, with no attribution/credit whatsoever to my blog and without my consent.

    Fortunately I had learnt about the Google DMCA a week prior to the discovery and as my blog had the Copyscape widget, I was able to identify the culprit.

    I proceeded to contact Google and to file for the DMCA. Of course I had to prove that I wrote all the articles, which I was able to do by copying the original URLs specific to each article and forwarding to Google along with other information. Google was fantastic and in about two months, all my articles were removed from the offending site, which was “frozen” soon afterwards. The offending blog still appears on the Google search engine but with no content whatsoever, neither mine nor his so it might as well be non-existent.

    I was lucky. If I hadn’t used the free Copyscape widget to check for duplicate copies or if I hadn’t done manual Google searches under my name and blog, I might never have been aware of this injustice!

  • Hammad

    If you can give me a reference to your site and the copied content, I may be able to help you. You can also write to me at hammadrs@gmail.com if you don’t want to share it in public.

  • janetdriscollmiller

    Actually your ranking doesn’t matter. You can still be affected by Panda, as in this case. It’s all a matter of who Google perceives as most “authoritative” and the original author. Regardless, to be safe, I would monitor the content to be sure.

  • Hammad

    Ranking does matter! How would you know that you are/can be penalized? If your content stays on top of the results, then that is evidence that Google treats you as the original author!

  • http://www.cendrinemarrouat.com/ Cendrine Marrouat

    Great article, Janet, thank you!

    Content scraping is such a widespread practice now. Thankfully, we have a lot of tools to protect ourselves.

    I wish people realized how frustrating it is to have to contact the infringers through email and social networks and then file a DMCA request!

  • creature77

    Here’s a twist on copyright infringement that I’m not sure how to handle.
    Someone copied my entire web page verbatim and then converted the page to a PDF file and promoted it as a “free download”.
    Now, I can find (through Google search) at least 20 sites offering that pirated PDF as a free download – with the original thief’s title.
    Since the download access page only contains the link but not the entire text, can I get the link pages removed through a DMCA request?

 
