Identifying Invalid Website Traffic
On December 18th, 2018 at 18:00 PST, my blog was the target of a spam bot attack.
My website's request rate spiked from an average of 100/hour to 2,000/hour over the course of 3 minutes. The traffic was completely inorganic: 0:00 spent on site, a 99.99% bounce rate, and no increase in user counts on any of my webpages, according to Google Analytics. As a result of this invalid spike in traffic, Google shut down my AdSense account completely, and every attempt I made to contact them about reinstatement was answered with:
After thoroughly reviewing your account data and taking your feedback into consideration, our specialists have confirmed that we're unable to reinstate your AdSense account.
I was left vulnerable to a mysterious attacker and shunned by the Google AdSense team, and my profitable blogging business model hit rock bottom at $0.00 in earnings. You can read about how I found a new advertising platform in this article.
What is Spam?
In all of its variations (email, forms, clicks, etc.), spam is a brute-force attempt to provoke a premeditated response from a wide range of people. Most commonly, spam tries to gather personal information (credit cards, phone numbers, etc.) through spam emails, pop-ups on websites and more. In my case, spam traffic bots succeeded in pinging my website thousands of times over the course of a few minutes, most likely to put me in a financial hole (all those data transfer charges from AWS) or perhaps to damage my partnership with Google. Some spam, such as newsletters, can be avoided indefinitely, but other kinds are more cunning and change their appearance to slip past security protocols undetected.
Identifying Invalid Traffic
At first I was clueless as to what caused this dramatic increase in web traffic. The first indicator was actually the email from Google about my AdSense account being shut down, as I was away from my laptop at the time and not actively monitoring my site. I immediately went to Google Analytics to see if I could identify this spike in traffic. The first thing I saw was a huge spike in website users, which I was able to home in on by filtering by date > today. Google Analytics is also a great tool to identify your website ranking in search results.
The interesting thing is that when I tried to view which pages these users were landing on, there was no data. The per-page user count did not reflect the increase in traffic at all, which tells me the bots were sending requests to the site and never actually loading the page. Definitely a vulnerability in design on my end. I was also able to spot the increased request rate in the CloudFront distribution console in AWS, which is where I handle all my hosting needs.
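If you have raw request timestamps on hand (from access logs, for example), a few lines of Python can surface the same kind of spike. This is only a minimal sketch: the function names, the per-minute bucketing and the threshold are my own assumptions for illustration, not something the CloudFront console exposes.

```python
from collections import Counter
from datetime import datetime
from typing import Iterable, List, Tuple


def requests_per_minute(timestamps: Iterable[datetime]) -> Counter:
    """Count requests in one-minute buckets (hypothetical helper)."""
    # Truncating seconds and microseconds groups requests by minute.
    return Counter(ts.replace(second=0, microsecond=0) for ts in timestamps)


def find_spikes(counts: Counter, threshold: int) -> List[Tuple[datetime, int]]:
    """Return (minute, count) pairs whose count exceeds the threshold."""
    return sorted((minute, n) for minute, n in counts.items() if n > threshold)
```

Feeding it a list of parsed log timestamps and a threshold a little above your normal per-minute rate gives you the minutes worth a closer look.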
The first thing I did was set up numerous alerts on the AWS side of things. CloudWatch is the alarm service that AWS offers, so I set up several alarms to notify me of future invalid activity. It turns out that Google AdSense actually has a form you can submit when you think your site has invalid traffic. I suppose that by tipping them off ASAP, you show them that the traffic is not caused by you, and they take that into consideration when evaluating whether your account needs to be shut down. Had I known about this form sooner, I would have submitted my claim right away. Lesson learned, I guess.
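For anyone who prefers scripting alarms over clicking through the console, here is a rough sketch of a CloudWatch alarm on a CloudFront distribution's request count using boto3. The alarm name, the 60-second period, the threshold and the SNS topic are assumptions chosen for illustration; the post doesn't record the exact alarms that were configured.

```python
from typing import Any, Dict


def build_alarm_params(distribution_id: str, threshold: int,
                       sns_topic_arn: str) -> Dict[str, Any]:
    """Build keyword arguments for CloudWatch's put_metric_alarm (illustrative values)."""
    return {
        "AlarmName": f"cloudfront-request-spike-{distribution_id}",  # hypothetical name
        "Namespace": "AWS/CloudFront",
        "MetricName": "Requests",
        "Dimensions": [
            {"Name": "DistributionId", "Value": distribution_id},
            {"Name": "Region", "Value": "Global"},
        ],
        "Statistic": "Sum",          # total requests per period
        "Period": 60,                # seconds; an assumed evaluation window
        "EvaluationPeriods": 1,
        "Threshold": float(threshold),
        "ComparisonOperator": "GreaterThanThreshold",
        "TreatMissingData": "notBreaching",
        "AlarmActions": [sns_topic_arn],  # SNS topic that delivers the notification
    }


def create_alarm(params: Dict[str, Any]) -> None:
    """Create the alarm; requires boto3 and AWS credentials."""
    import boto3  # third-party AWS SDK, imported lazily so the builder works without it

    # CloudFront metrics are published in us-east-1.
    client = boto3.client("cloudwatch", region_name="us-east-1")
    client.put_metric_alarm(**params)
```

Keeping the parameter builder separate from the API call makes it easy to review the settings before anything is created in your account.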
After configuring numerous alarms, I turned to my CloudFront distribution to throttle request rates. Because my website at the time never exceeded 200 requests per minute, I set the threshold there and crossed my fingers that it wouldn’t hinder user experience. Since then, I’ve raised this limit to accommodate increased (organic) traffic.
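To make the idea of a request-rate threshold concrete, here is a small sliding-window limiter in Python. It is a sketch of the general technique only, not how AWS enforces limits; the class name and the choice to pass the clock in as an argument are assumptions made so the example is easy to follow and test.

```python
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_requests within any window_seconds span (illustrative)."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._timestamps = deque()  # times of accepted requests, oldest first

    def allow(self, now: float) -> bool:
        # Drop accepted requests that have aged out of the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) >= self.max_requests:
            return False  # over the limit: reject this request
        self._timestamps.append(now)
        return True
```

With `SlidingWindowLimiter(200, 60)`, the first 200 requests in a minute pass and further requests are rejected until older ones fall out of the window.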
When I turned to possible security measures in Google Analytics (since the AdSense console was locked for me), I found something interesting. Upon further investigation of the invalid traffic, I was able to identify that the network source was none other than Amazon Technologies Inc. After some research and finding others who have had similar occurrences, this appears to be the source of a known bot spamming issue stemming from the Ashburn, Virginia region.
Ashburn is home to one of, if not the, largest cloud computing datacenters in the entire Amazon network. I hypothesize that one of their clients is misusing the hardware and sending bot traffic to drown people’s websites. Don’t worry, I’ve already submitted a claim to the AWS abuse team about this.
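If you have the offending IP addresses, you can check them against the IP ranges that AWS publishes as JSON at https://ip-ranges.amazonaws.com/ip-ranges.json. The sketch below assumes you have already downloaded and parsed that file; the function name is made up, and the sample data uses reserved documentation addresses rather than real Amazon prefixes. It only looks at the IPv4 `prefixes` list.

```python
import ipaddress
from typing import Dict, List


def find_aws_matches(ip: str, ip_ranges: Dict) -> List[Dict[str, str]]:
    """Return region/service pairs for every published IPv4 prefix containing ip."""
    address = ipaddress.ip_address(ip)
    matches = []
    for entry in ip_ranges.get("prefixes", []):
        network = ipaddress.ip_network(entry["ip_prefix"])
        if address in network:
            matches.append({"region": entry["region"], "service": entry["service"]})
    return matches


# Made-up sample in the same shape as the published file (NOT real AWS prefixes).
SAMPLE_RANGES = {
    "prefixes": [
        {"ip_prefix": "203.0.113.0/24", "region": "us-east-1", "service": "EC2"},
        {"ip_prefix": "198.51.100.0/24", "region": "eu-west-1", "service": "EC2"},
    ]
}
```

A match in `us-east-1`, the Northern Virginia region, would be consistent with traffic coming out of the Ashburn area, though it says nothing about who is actually running the bots.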
Through this experience, I’ve learned a lot about the security of my website, as well as the unforgiving nature of Google (I don't blame them though). I hope my misfortune has offered some insight into how you can monitor your resources more closely and, of course, be reminded to take security seriously.
If you found this article informative, please consider sharing using the social media icons below. To read about my journey to a new advertising partnership, check out this article. Have comments or thoughts? Let me know on Twitter!