Monday, October 3, 2011

Tea Spam: "Boutique" Spam

This post is part of an ongoing series about tea spam--unsolicited advertising on the internet relating to tea. If you have not seen it, I recommend reading my original post, Tea Spam: Starting With The Most Blatant, in which I introduce the concept of the spam blog. If you don't wish to read it: a spam blog is a blog that uses automated software to copy content from other websites without permission (plagiarism and copyright violation) and repost it as its own. The blog acquires readers and traffic from search engines, and makes money from advertisements.

Because spam blogs are often run by automated software, a spam blogger can create thousands of them. Even if the income from any one blog is very low, the total earned from stealing other people's work can be considerable.

I've actually had some victories shutting down spam blogs. I outline them in that original post, which also gives tips and guidance on how to get these blogs shut down.

Boutique Spam:

A lot of spam blogs look, for lack of a better word, "spammy": hastily constructed, and obviously automated to a trained eye. But not all of them do. Some time ago, Brandon of Wrong Fu Cha brought to my attention a phenomenon that he calls "boutique spam". Below is a screenshot from a spam site which makes daily spam posts, yet has a professional-looking layout and is extremely well-designed:



At a glance, this site looks totally legit. It has a Twitter account with a huge following (over 43,000 followers), and a Facebook page. The site is continuously updated with new articles about tea. But...something is off: who in tea has that many legitimate Twitter followers? Even Tony Gebely (arguably a big name when it comes to tea on the web) only has 22,000-some. And if you look carefully at the articles, you see something very suspicious: a slightly unnatural wording or phrasing of the text. Here's an example:



Note the headline (click the image to see the full text up close) with the grammatically correct, but extremely awkward-sounding phrase "A Brief Introduction Towards Blooming Tea". No human would ever write this. But a person might write "A Brief Introduction To Blooming Tea". This raised suspicions for me...it seems like automated article spinning, in which software automatically replaces words with synonyms, so that search engines will not be able to recognize the article as being the same as whatever original article it was taken from. This "article spinning" has two benefits to the spammer: (a) it allows the spammer to avoid detection by the copyright holder, and thus avoid legal action; and (b) it allows the spammer to be treated by search engines as having "unique content". Search engines preferentially index unique content and generally avoid indexing or highly ranking duplicated content.
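
To make the idea concrete, here is a toy sketch of how word-for-word synonym replacement could produce a headline like the one above. I don't know what software this site actually uses; the one-entry synonym table and the function name are my own assumptions, purely for illustration.

```python
# Hypothetical synonym table. Real spinning software is assumed to use a
# much larger one; this single entry is only for illustration.
SYNONYMS = {"to": "towards"}


def spin(text):
    """Swap each word found in SYNONYMS, preserving simple capitalization."""
    out = []
    for word in text.split():
        replacement = SYNONYMS.get(word.lower())
        if replacement is None:
            out.append(word)  # no synonym known: keep the word as-is
        elif word[0].isupper():
            out.append(replacement.capitalize())
        else:
            out.append(replacement)
    return " ".join(out)


original = "A Brief Introduction To Blooming Tea"
spun = spin(original)

print(spun)              # A Brief Introduction Towards Blooming Tea
print(spun == original)  # False: an exact-match comparison no longer fires
```

The swap is blind to context, which is why the result is grammatical but sounds wrong, and a single changed word is enough to defeat a naive exact-text comparison.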

Finding the Original Content:

Finding the original article can be a bit tricky in some cases. Typing the title into Google, replacing "Towards" with "To", immediately turned up some results which are obviously the same article. Interestingly, though, I could not easily tell which one was the original, because the results I found also seem to be spun articles on other spam blogs. But...the degree to which this article has been duplicated throughout the web, together with the unnatural wording of this blog post, makes it clear that the content did not originate on this blog.
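
If you want something a little more systematic than eyeballing search results, one simple approach is to measure how many words two titles (or passages) share. The sketch below uses Jaccard similarity over word sets; this is my own illustration, not how search engines detect duplicates, and the "unrelated" title is made up for comparison.

```python
import re


def words(text):
    """Lowercase the text and return its set of distinct words."""
    return set(re.findall(r"[a-z']+", text.lower()))


def jaccard(a, b):
    """Shared words divided by total distinct words (0.0 to 1.0)."""
    wa, wb = words(a), words(b)
    if not wa and not wb:
        return 1.0  # two empty texts are treated as identical
    return len(wa & wb) / len(wa | wb)


suspect = "A Brief Introduction Towards Blooming Tea"
candidate = "A Brief Introduction To Blooming Tea"
unrelated = "How To Brew Oolong In A Gaiwan"  # made-up title for contrast

print(round(jaccard(suspect, candidate), 2))  # 0.71
print(round(jaccard(suspect, unrelated), 2))  # 0.08
```

A high score only says two texts are near-duplicates; it does not say which one came first, which matches my experience above of finding many copies but no clear original.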

Shutting down spammers: what can you do to help?

We can all do our part to prevent web spam. Here are a few tips; the first two are the most important.

  • Don't judge sites at a glance. Look a bit deeper before passing judgment. It takes a bit more time, but ask yourself: do you really want to be duped?

  • Be cautious about which pages you link to, who you follow on Twitter, which blogs you subscribe to or add to your blog's blogroll, and what you like on Facebook.

  • If you encounter a spam blogger using Twitter, block and report them using the button / feature on Twitter. Same goes for Facebook accounts--there's a "Report" button at the bottom of profiles.

  • If discussing a specific spam blog or spam site, do not link to it, even when discussing it as spam; it is best to only include a screenshot, as I did in this post. This ensures that search engines do not follow the link to the site and end up thinking either that the site is legitimate, or that your blog too is promoting spam.

  • Consider some of my tips on shutting down spam blogs, including emailing the domain host, web host, ad host, and reporting the site as search engine spam if it is appearing in search results. Here's Google's page to report webspam.


I'm also curious: have you encountered this site? Did you recognize it as a spam blog? It actually fooled me at first glance, and I had followed its Twitter account, so don't feel bad.
