What is Spam Google Analytics Traffic?
Did you know that anyone can send bogus traffic to your Google Analytics property without ever visiting your website? All they need is your unique tracking ID.
The majority of spam is generated automatically by sending data directly to the Google Analytics servers without ever touching your website. And if you were wondering how spammers find your unique tracking ID... they guess!
The Google Analytics tracking ID has the format UA-XXXXXXXX-X, where each X represents a digit. That makes it easy for spammers to generate a big list of possible IDs to loop through.
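To get a feel for how small that search space is, here is a rough Python sketch. It takes the format described above at face value (eight digits, then one digit), which is an assumption, and the helper name `looks_like_tracking_id` is made up for illustration.

```python
import re

# Format as described in this article: "UA-", eight digits, "-", one digit.
# Assumption: real IDs may use a different number of digits.
TRACKING_ID = re.compile(r"UA-\d{8}-\d")


def looks_like_tracking_id(value: str) -> bool:
    """Return True if value has the UA-XXXXXXXX-X shape."""
    return TRACKING_ID.fullmatch(value) is not None


# Every X is one of ten digits, so there are 10^8 * 10 candidates in total.
possible_ids = 10 ** 8 * 10

print(looks_like_tracking_id("UA-12345678-1"))  # True
print(looks_like_tracking_id("UA-1234-ABCD"))   # False
print(f"{possible_ids:,}")                      # 1,000,000,000
```

A billion candidates sounds like a lot, but it is a trivial list for a script to work through.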
What Does Spam Traffic Look Like?
The easiest way to identify potential spam traffic in your Google Analytics view is to navigate to Audience > Technology > Network and select Hostname as the Primary Dimension.
If there is spam traffic, you'll see websites that you don't recognise as well as "(not set)" in the list of hostnames.
How to Eliminate Hostname Spam
The fix is relatively simple. Because the spammers have only guessed your tracking ID, they don't know which website it belongs to, so the hostname they send is wrong or missing. Implementing a hostname filter removes any traffic that doesn't report one of your own hostnames.
It's also worth pointing out that filters are not retroactive and will, therefore, only filter spam traffic going forwards. Any historic spam traffic will still be shown in your Google Analytics reports.
Test Your Regular Expression
This filter uses a regular expression to match one or more relevant hostnames. Getting this wrong could filter out legitimate traffic and leave you with even more inaccurate data. To prevent this from happening, it's a good idea to test your regular expression prior to implementing the hostname filter.
This is the regular expression we will be using. Update example\.com with your domain name, making sure to use a backslash (\) to escape any periods (.).
^(.*)?example\.com|^(.*)?googleusercontent\.com$
Go back to Audience > Technology > Network and select Hostname as the Primary Dimension. Click advanced next to the search field and set the match type to Matching RegExp. Paste in your regular expression, modified for your domain name, and hit Apply.
The list of hostnames will now show only traffic that has originated from your website or was hosted on the googleusercontent.com domain, which happens when a page is served from Google's cache.
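If you would rather check the pattern outside Google Analytics first, something like the following Python sketch can help. The hostnames are invented examples, `is_valid_hostname` is a hypothetical helper, and the sketch assumes the filter does a partial match, which is how `re.search` behaves.

```python
import re

# The pattern from this article, unchanged. Swap in your own domain.
HOSTNAME_PATTERN = re.compile(
    r"^(.*)?example\.com|^(.*)?googleusercontent\.com$"
)

# Hypothetical hostnames, e.g. copied from the Hostname report.
hostnames = [
    "example.com",
    "www.example.com",
    "webcache.googleusercontent.com",
    "free-traffic.spam.test",
    "(not set)",
]


def is_valid_hostname(hostname: str) -> bool:
    # search() allows a partial match; the ^ and $ anchors in the
    # pattern decide where a match may start and end.
    return HOSTNAME_PATTERN.search(hostname) is not None


kept = [h for h in hostnames if is_valid_hostname(h)]
dropped = [h for h in hostnames if not is_valid_hostname(h)]

print(kept)     # the three legitimate hostnames
print(dropped)  # the spam hostname and "(not set)"
```

One thing a test like this can surface: only the second alternative ends with $, so a hostname such as example.com.spam.test would also be kept. Adding $ directly after the first example\.com tightens the match if that matters to you.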
As long as everything looks good, you can now create and apply the hostname filter.
Include Hostnames Filter
From Google Analytics:
1. In the Admin area, select the Filters option under the view you would like this applied to.
2. Click + Add Filter.
3. Name the filter Include Hostnames.
4. Create a Custom filter.
5. Select Include.
6. Choose Hostname as the Filter Field.
7. Use a Regular Expression of ^(.*)?example\.com|^(.*)?googleusercontent\.com$ as the Filter Pattern. If you track more than one domain (cross-domain tracking), add each domain to the pattern, separated by a pipe (|).
8. Click Save.
To apply this filter to another view simply follow steps one and two for the relevant view but instead choose to Apply existing Filter and click Add >>.
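When several domains are involved, writing the pattern by hand gets error-prone. The sketch below builds it from a list of domains. `build_hostname_pattern` is a hypothetical helper, not part of Google Analytics, and it anchors every alternative with $, which is slightly stricter than the pattern shown above (a choice made here, not a requirement).

```python
import re


def build_hostname_pattern(domains):
    """Join several domains into one pipe-separated filter pattern."""
    parts = []
    for domain in domains:
        # Escape each period with a backslash so it matches a literal dot.
        escaped = domain.replace(".", r"\.")
        # Optional subdomain prefix, then the domain, anchored at both ends.
        parts.append(rf"^(.*)?{escaped}$")
    return "|".join(parts)


pattern = build_hostname_pattern(
    ["example.com", "example.org", "googleusercontent.com"]
)
print(pattern)

# Quick sanity check against a couple of made-up hostnames.
print(bool(re.search(pattern, "shop.example.org")))       # True
print(bool(re.search(pattern, "example.org.spam.test")))  # False
```

The printed pattern can be pasted into the Filter Pattern field once it has passed the hostname test described earlier.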
How to Eliminate Other Spam
For automated traffic, i.e. where a bot actually visits the website, it can be harder to distinguish between fake and genuine visits. Instead of sending data straight to the Google Analytics servers, the bot simulates what a normal website visitor would do and leaves Google Analytics to collect all the associated data.
The key to filtering this type of automated spam traffic is to identify a dimension that is set for a regular visitor but not an automated bot. One such dimension is Browser Size, which often appears as "(not set)" for automated spam traffic.
Exclude Browser Size Not Set Filter
From Google Analytics:
1. In the Admin area, select the Filters option under the view you would like this applied to.
2. Click + Add Filter.
3. Name the filter Exclude Browser Size Not Set.
4. Create a Custom filter.
5. Select Exclude.
6. Choose Browser Size as the Filter Field.
7. Enter (not set) as the Filter Pattern.
8. Click Save.
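The effect of this filter can be sketched in Python on a few made-up rows. The field names and values are hypothetical, and the sketch assumes the Filter Pattern is treated as a regular expression with partial matching. Under that assumption the unescaped parentheses act as a group, so (not set) still matches any value containing "not set". Escaping them matches the literal text instead.

```python
import re

# Hypothetical rows, shaped like an export of hostname and browser size.
rows = [
    {"hostname": "www.example.com", "browser_size": "1280x720"},
    {"hostname": "www.example.com", "browser_size": "(not set)"},
    {"hostname": "example.com", "browser_size": "360x640"},
]

# As typed in the filter: the parentheses form a regex group.
LOOSE = re.compile(r"(not set)")
# Escaped: matches the literal characters "(not set)".
LITERAL = re.compile(r"\(not set\)")


def exclude_not_set(data, pattern):
    """Drop every row whose browser size matches the pattern."""
    return [row for row in data if not pattern.search(row["browser_size"])]


loose_result = exclude_not_set(rows, LOOSE)
literal_result = exclude_not_set(rows, LITERAL)

print(len(literal_result))             # 2
print(loose_result == literal_result)  # True
```

Both forms drop the same row here, so the pattern as entered in step 7 does the job. The escaped form is simply more explicit.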
Remember, filters only affect future data. They do not remove spam traffic from data already shown in Google Analytics. To hide spam in historical reports, use an Advanced Segment instead.