A couple of days ago I was working on entering all of the people I found during my Pop-Tart-fueled middle-of-the-night research binge. As is often the case when I’m doing data entry, I got bored. Who doesn’t like doing the research more than keying in the results?
I decided I’d indulge just a little bit by googling each person as I entered them. Imagine my surprise when I found multiple mentions of each person! “Oscar Conrad Lauby”, for example, had nearly a full page of results.
Then I noticed something interesting. The first result when I searched for “Oscar Conrad Lauby” was my own post about him. The other results were…also my own post. The words were mine, but they were appearing on splogs. Poor Oscar has been kidnapped and taken into some pretty ugly neighborhoods. My paragraph about finding him in the Minnesota Death Index is now on a site for people who want to meet horny Atlantic City women, another for people who like “casual Latina sex women,” and several other sketchy sites. They haven’t stolen the whole article—just a few lines, often (but not always) with a link back to my site. I’m not sure what sort of SEO strategy involves linking dead people to horny people, but apparently somebody out there thought this was a good idea.
I found the same thing when I searched for others from the same post. I found content from other posts too. In fact, my favorite part of this whole exercise was discovering where one post ended up: content I wrote about funny things employees did when they failed pre-employment drug tests had been copied onto a site that provides information and products designed to help you avoid testing positive for drugs. At least the thief had a sense of irony.
I know Thomas MacEntee has done a great deal of work trying to stop content scrapers from stealing content from members of GeneaBloggers. It seems there are more and more of these out there; I’m a little overwhelmed by the number I found when I looked. In about 20 minutes, I found roughly 70 sites with at least a few lines of my content. There are probably more, but by then I’d run out of time and patience. I do have Google Alerts set up for my name and blog name, but not for every sentence I write.
Some folks think the best strategy is to just let it go, and make sure you have lots of links back to your own content in your posts (something I do anyway). The idea is that you’ll benefit from the links in terms of search engine rankings, so your original content will always be at the top of the list (and with those Oscar Lauby snippets, my post WAS at the top). I’ve resisted this thought up until now…but geez. This is beginning to feel like a part-time job.
What do you think? How do you deal with content scrapers?
Photo by dullhunk