Why You Need Spam Domain Referral & Bot Filtering If You Are a Data Analyst

Data pollution can ruin both the day and the reports of any data analyst. De-crappifying data, as Annielytics puts it, is a must if you are using Google Analytics. There are lots of ways to do that, but nowadays excluding spam domain referrals is essential for any data analyst.

In Google Analytics, referrer spam occurs when a website receives fake referral traffic from spam bots. You can check for it by logging into your Analytics account, going to Acquisition, selecting All Traffic and then clicking Referrals. The fake traffic that Google Analytics records can seriously corrupt your data: you may see the number of visitors spike by as much as 100%.

Before I go on, let's understand what a bot/spider actually is. A bot or spider is an automated program that crawls/hits your website for various reasons. It may be a search engine like Google looking to crawl and index your site and content (provided you have allowed it in your robots.txt file), or it can be a simple program checking whether you have updated your blog, your offers or anything else it might be looking for. It may also be a service that hits your site to check whether it is up and running.

Some bots, the kind ones, don’t run the code on your site and don’t appear as hits in your Google Analytics. Known bots can easily be excluded in your Google Analytics settings: go to the Admin panel, click View Settings and select the check box "Exclude all hits from known bots and spiders".

The malicious bots are the ones that trigger your scripts and code, and those hits get registered in your Google Analytics as visits. These bots can have many purposes, and if you are a data analyst they will make your life miserable by inflating the historical data in your Analytics and affecting your sampling.

But for a data analyst the problems don’t stop here, because this is about more than just a traffic or session spike. Some of these bots, especially the ones you pay for, can even log into your site and pretend to be a specific audience segment.

Not filtering out these bots could seriously mess up your data. You might be looking at a specific audience segment and see a wild swing in traffic. Similarly, spam domain referrers might be corrupting your session and user counts.

You can block them in Google Analytics by using filters. Since Google Analytics allows only 250 characters in a single filter pattern, you may have to create more than one filter. Here are some of the most malicious spam domain referrers:

Spam Domain List


Spam Domain List #1



Spam Domain List #2


Spam Domain List #3


Spam Domain List #4


Spam Domain List #5


A word of caution if you are using the Referral Exclusion List in your Property settings! Don’t use it, as it may simply turn those spammy domain referrals into direct traffic. It won’t even let you know who those spammy domains/bots were, and you will end up spending hours delving into your server logs to identify these malicious bots.

Also, I would advise you to create a new view in Google Analytics, say with the name Spam Domain Block, and then apply these filters there.

You should also add an annotation in your main Google Analytics view if you are applying these filters there, so that you can see what changed by comparing your historical data with the data after the filter was applied.
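Because each filter pattern has to stay under the character limit mentioned earlier, it can help to generate the patterns from your domain list instead of editing them by hand. Here is a minimal sketch in Python, not an official Google Analytics tool. The function name `build_filter_patterns` and the `.example` domains are made up for illustration, and the 250 limit is simply the figure used in this post.

```python
def build_filter_patterns(domains, max_len=250):
    """Group domains into '|'-joined regex patterns of at most max_len characters."""
    patterns = []
    current = ""
    for domain in domains:
        # Escape dots so they match a literal "." instead of any character.
        escaped = domain.replace(".", r"\.")
        if len(escaped) > max_len:
            raise ValueError(f"{domain!r} alone is longer than {max_len} characters")
        candidate = escaped if not current else current + "|" + escaped
        if len(candidate) <= max_len:
            current = candidate
        else:
            # The pattern is full: store it and start a new one.
            patterns.append(current)
            current = escaped
    if current:
        patterns.append(current)
    return patterns


if __name__ == "__main__":
    # Hypothetical placeholder domains, not a real spam list.
    spam_domains = ["spam-one.example", "spam-two.example", "spam-three.example"]
    for number, pattern in enumerate(build_filter_patterns(spam_domains), start=1):
        print(f"Filter #{number} ({len(pattern)} chars): {pattern}")
```

Each returned string can then be pasted as the filter pattern of its own exclude filter.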

But blocking these malicious bots that act like users and create referral sessions in your Google Analytics is just one step. You will also want to block them from accessing your site, because they use your server, download data as they crawl your pages and make you pay more for bandwidth.
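Before blocking anything at the server level, it can be worth confirming which referrers are actually hitting your server. The following is a rough sketch that counts referrer hosts in an access log. It assumes your server writes the common "combined" log format, where the referrer is the second-to-last quoted field. The function name, the sample log lines and the `.example` domain are hypothetical.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Tail of a combined-format line: "request" status bytes "referer" "user-agent"
LOG_TAIL = re.compile(r'"[^"]*" \d{3} \S+ "(?P<referer>[^"]*)" "[^"]*"\s*$')


def count_referrer_hosts(lines):
    """Return a Counter mapping referrer hostnames to the number of hits."""
    counts = Counter()
    for line in lines:
        match = LOG_TAIL.search(line)
        if not match:
            continue  # Not a combined-format line; skip it.
        referer = match.group("referer")
        if referer in ("", "-"):
            continue  # No referrer was sent with this request.
        host = urlparse(referer).hostname
        if host:
            counts[host] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample lines; in practice, iterate over your access log file.
    sample = [
        '203.0.113.5 - - [10/Dec/2014:13:55:36 +0000] "GET / HTTP/1.1" 200 2326 '
        '"http://spam-one.example/page" "Mozilla/5.0"',
        '203.0.113.9 - - [10/Dec/2014:13:56:01 +0000] "GET /blog HTTP/1.1" 200 512 '
        '"-" "Mozilla/5.0"',
    ]
    for host, hits in count_referrer_hosts(sample).most_common():
        print(host, hits)
```

Hosts that send many hits but that you do not recognize are candidates for the block list.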

The best way to block referrers from accessing your site at all is to block them in the .htaccess file in the root directory of your domain. You can copy and paste the following code into your .htaccess file, assuming you’re on an Apache server with mod_rewrite enabled. If you want to get creative and get back at them, you can also redirect the incoming traffic back to the spammy referral domains themselves.




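RewriteEngine On
# Return 403 Forbidden when the referrer contains any of the listed domains.
# Every condition except the last one carries the OR flag.
RewriteCond %{HTTP_REFERER} example1\.com [NC,OR]
RewriteCond %{HTTP_REFERER} example2\.com [NC,OR]
RewriteCond %{HTTP_REFERER} example3\.com [NC]
RewriteRule .* - [F]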

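Or, as another example, with the pattern anchored to the start of the referrer URL:

RewriteEngine On
# Matches both http and https referrers. The last condition must not carry
# the OR flag, otherwise the rule would apply to every request.
RewriteCond %{HTTP_REFERER} ^https?://.*example1\.com [NC,OR]
RewriteCond %{HTTP_REFERER} ^https?://.*example2\.com [NC,OR]
RewriteCond %{HTTP_REFERER} ^https?://.*example3\.com [NC,OR]
RewriteCond %{HTTP_REFERER} ^https?://.*example4\.com [NC]
RewriteRule .* - [F]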
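With a long block list, writing these conditions by hand invites mistakes such as an unescaped dot or a stray OR flag on the last line. As a minimal sketch, the block can be generated from a list of domains. The function name `build_htaccess_block` is hypothetical, and the `^https?://` prefix is my assumption for covering both http and https referrers.

```python
def build_htaccess_block(domains):
    """Build mod_rewrite lines that return 403 for the given referrer domains."""
    if not domains:
        return ""
    lines = ["RewriteEngine On"]
    for index, domain in enumerate(domains):
        # Escape dots so they match a literal "." in the regular expression.
        escaped = domain.replace(".", r"\.")
        # Every condition except the last one is chained with OR.
        is_last = index == len(domains) - 1
        flags = "[NC]" if is_last else "[NC,OR]"
        lines.append(f"RewriteCond %{{HTTP_REFERER}} ^https?://.*{escaped} {flags}")
    # "-" leaves the URL untouched; F answers with 403 Forbidden.
    lines.append("RewriteRule .* - [F]")
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder domains, as in the examples above.
    print(build_htaccess_block(["example1.com", "example2.com", "example3.com"]))
```

Paste the printed lines into your .htaccess file and test on a staging copy first, since a typo in this file can take the whole site down.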

Additional Resources:




Special thanks to http://www.analyticsedge.com/2014/12/removing-referral-spam-google-analytics/ for the exhaustive list of spam domain referrers. I have also added some new bots to the first list.

