Block Feed Scrapers and Automated Blogs From Stealing Your Content by Delaying Your Blog or Website RSS Feed

These days many feed-scraping websites steal your blog posts through your website's feed and republish them on their own sites. Surprisingly, Google often indexes their copies faster than your originals, so when you search for one of your own articles you may find a scraper that simply copied it ranking higher than your website, which holds the original content. In this way they steal your ranking on Google and other search engines, as shown in the example image below.

feed-scrappers-ranked-higher

The original website ranked 5th in Google search for the same page and the same search query.

google-search-origina-website 

There are different ways to stop these copy-paste websites from scraping your feed, or at least to stop them from outranking you when you cannot prevent the copying. The most effective way is to file a DMCA notice and force the hosting provider to suspend the site, but filing DMCA notices becomes a pain when many websites are copying you. A good first line of defense is to install the AntiLeech WordPress plugin, which makes sure that these websites automatically link back to your website and the original content.

What Does the AntiLeech Plugin Do?

AntiLeech does not prevent the splogger bots from accessing your site. It produces a fake set of content especially for them that includes links back to your site (and mine, too, ok?) and sends it only to them. When they steal this content, it appears online just like normal, except now you’ve turned the tables on them and have provided them with useless content.

How Does It Work?

As soon as you publish a blog post, the copying website is pinged and republishes your post almost instantly. You then receive a trackback ping from the scraped URL in your WordPress comments section. Note the IP address of the trackback and block it in the AntiLeech WordPress plugin options. The next time you publish an article, the scraped copy will be total rubbish: the plugin fills the scraper's content with junk text, removes all useful content, and adds a warning about feed republishing along with plenty of free links back to your site and some corrupted code at the end, as you can see in the image below.

4-20-2011 10-58-15 PM

Another way is to block the IP address of the server or website stealing your content by following the steps below.

1. Get the IP of the web site that is stealing your content.

%ping www.trafficboosterpro.com
PING trafficboosterpro.com (74.52.58.162): 56 data bytes

2. Search your logs for that IP address (via SSH).

%cat www.20061231 | grep "74.52.58.162"
74.52.58.162 - - [31/Dec/2006:01:00:38 -0500] "GET /blog/feed/ HTTP/1.0" 200 49330 "-" "TrafficBoosterPRo (+http://TrafficBoosterPro.com/)"

3. Place the following directives in your .htaccess file.

RewriteEngine On
RewriteCond %{REMOTE_ADDR} ^74\.52\.58\.162$
RewriteRule ^.*$ - [F]
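
Steps 2 and 3 can also be scripted. The following is a minimal shell sketch, not something the original steps prescribe: the helper names feed_hits_by_ip and htaccess_block_rule are hypothetical, and it assumes your access log uses the same combined log format as the line in step 2, with the feed served at /blog/feed/.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative and not part of any tool.

# Count requests to the feed per client IP, busiest first.
# Usage: feed_hits_by_ip LOGFILE [FEED_PATH]
feed_hits_by_ip() {
    log_file=$1
    feed_path=${2:-/blog/feed/}   # assumption: feed path as seen in the log line of step 2
    # In the combined log format, field 1 is the client IP and
    # field 7 is the requested path.
    awk -v path="$feed_path" '
        $7 == path { hits[$1]++ }
        END { for (ip in hits) print hits[ip], ip }
    ' "$log_file" | sort -rn
}

# Print mod_rewrite directives that answer 403 Forbidden to one IPv4 address.
# Usage: htaccess_block_rule IP
htaccess_block_rule() {
    # Escape the dots so they match literally in the regular expression.
    escaped=$(printf '%s' "$1" | sed 's/\./\\./g')
    printf 'RewriteEngine On\nRewriteCond %%{REMOTE_ADDR} ^%s$\nRewriteRule ^.*$ - [F]\n' "$escaped"
}
```

For example, running feed_hits_by_ip www.20061231 lists the addresses that fetch the feed most often, and htaccess_block_rule 74.52.58.162 prints directives you can review before pasting them into .htaccess. A high count alone does not prove scraping, since legitimate feed readers also poll frequently, so check the user agent before blocking anything.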

The easiest way to keep these websites from copying your content and ranking higher than your original website is to delay your RSS feed. When you publish on your blog or website, the scrapers cannot fetch the new content from the feed right away, because a plugin holds it back for a while, which should give search engines a chance to index your original first.

One such plugin is Feed Delay, which stops a post from being published to the feed immediately. You can set a delay in minutes, hours, or days, and it applies to every post published on your blog, as shown in the image below.

feed-delay-wordpress-plugin

A similar plugin, Feed Pauser, was developed by a friend of mine who is a fellow blogger. It lets you set a time delay between publishing a post and its availability through your feed. To configure it, visit the manage page and set the delay between publishing and feed availability.

Bloggers who do not want to install these plugins on their WordPress blogs can instead add the following code to functions.php, which delays posts from appearing in the RSS feed. It is also a good way to fix mistakes before they are published to your feed.

/**
 * Publish the content in the feed 15 minutes later.
 * $where is the default WHERE clause that WordPress builds (wp-includes/query.php).
 * This function appends a SQL condition to it.
 */
function publish_later_on_feed($where)
{
    global $wpdb;
    if ( is_feed() )
    {
        // current GMT timestamp in the format WordPress stores
        $now = gmdate('Y-m-d H:i:s');
        // how long to wait, counted in $unit
        $wait = '15'; // integer
        // http://dev.mysql.com/doc/refman/5.0/en/date-and-time-functions.html#function_timestampdiff
        $unit = 'MINUTE'; // MINUTE, HOUR, DAY, WEEK, MONTH, YEAR
        // append the SQL condition to the default $where
        $where .= " AND TIMESTAMPDIFF($unit, $wpdb->posts.post_date_gmt, '$now') > $wait ";
    }
    return $where;
}
add_filter('posts_where', 'publish_later_on_feed');
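
One detail of the snippet above is easy to miss: TIMESTAMPDIFF counts only complete units, and the comparison is strictly greater than $wait, so with a wait of 15 minutes a post should first show up in the feed 16 minutes after publication. The shell sketch below mirrors that arithmetic so you can check the boundary yourself. It is an illustration only: in_feed is a hypothetical helper that takes Unix timestamps in seconds, and it does not query WordPress or MySQL.

```shell
#!/bin/sh
# Hypothetical helper mirroring the SQL condition
#   TIMESTAMPDIFF(MINUTE, post_date_gmt, now) > wait
# Usage: in_feed PUBLISHED_EPOCH NOW_EPOCH WAIT_MINUTES
# Exit status 0 means the post would be included in the feed.
in_feed() {
    published=$1
    now=$2
    wait_minutes=$3
    # Integer division keeps whole minutes only, as TIMESTAMPDIFF does.
    elapsed_minutes=$(( (now - published) / 60 ))
    [ "$elapsed_minutes" -gt "$wait_minutes" ]
}
```

With a 15 minute wait, in_feed 0 959 15 fails (15 minutes and 59 seconds have elapsed) while in_feed 0 960 15 succeeds. If you want posts to appear after exactly 15 minutes, changing > to >= in the PHP filter would be one way to do it.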
