Blog scraping

Blog scraping is the process in which automated software scans hundreds of thousands of blogs per day, searching for and copying content. The software, or the individuals who run it, are known as "blog scrapers."

"Scraping" essentially means copying content from a blog that the person initiating the scrape does not own. When the material is copyrighted and copied without permission, it amounts to stealing. The scraped content is often republished on spam blogs, also called splogs.

Dangers

If a blog scraper copies copyrighted material without permission, that is generally a violation of copyright law. Blog scraping also causes a number of more practical problems for the person or business that owns the blog, and it is particularly worrisome for business owners and business bloggers. Scrapers can copy an entire post from an independent or business blog. In that case the duplicated content will include the author's tag, along with a link back to the author's site if that link appears in the author's tag.

However, most blog scrapers copy only the portion of the content that is keyword-relevant to their splog's topic. This has two effects. First, it increases the keyword relevancy of the scraper's site. Second, because the post is not copied in full, its outbound links are left out, so the scraper's search engine ranking is not reduced by linking out to other sites.
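To make the selection step described above concrete, here is a minimal sketch of what "keeping only the keyword-relevant portion and leaving out links" could look like. It is an illustration under stated assumptions, not how any particular scraper works: the function name `select_excerpt`, the link pattern, and the sample post are all hypothetical and do not come from the original text.

```python
import re

# Hypothetical pattern: treats an HTML anchor or a bare URL as "a link".
LINK_PATTERN = re.compile(r"<a\s[^>]*href=|https?://", re.IGNORECASE)


def select_excerpt(paragraphs, keywords):
    """Return the paragraphs that mention a keyword and contain no links."""
    lowered = [keyword.lower() for keyword in keywords]
    kept = []
    for paragraph in paragraphs:
        text = paragraph.lower()
        has_keyword = any(keyword in text for keyword in lowered)
        has_link = LINK_PATTERN.search(paragraph) is not None
        if has_keyword and not has_link:
            kept.append(paragraph)
    return kept


# Made-up sample post, one string per paragraph.
post = [
    "Our bakery sells sourdough bread every morning.",
    'Read more on <a href="https://example.com">our site</a> about bread.',
    "We are closed on public holidays.",
]

excerpt = select_excerpt(post, ["bread"])
print(excerpt)
```

In this sample only the first paragraph is kept: the second mentions the keyword but carries a link back to the author's site, and the third is off-topic. This is why a partially scraped post usually loses its attribution.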

Additionally, scraped content can appear on any type of splog or RSS-fed spam site. An unsuspecting author could therefore find their creative or copyrighted material copied onto a site promoting pornography or other content that is offensive to the author and the author's audience, which may damage the author's reputation.
