Thursday, December 30, 2010

Tea Spam - Starting with the Most Blatant

Brandon of Wrong Fu Cha has inspired me to write about spam in the world of tea. I am going to follow up on this post later with one on more subtle spamming techniques, but this post will get the basics out there.

What is spam?


We all know what unsolicited email spam is. Unfortunately, spam is not limited to your inbox: it also occurs on blogs and webpages. I prefer a broader definition of spam. Wikipedia has a good page on spam blogs, which are not the same as blog spam (the leaving of unsolicited advertisements in blog comments). I also consider a site to be spam if it is exclusively oriented toward selling a product.

In the world of tea, many spam sites center around selling green tea or oolong tea (usually spelled wu long, or presented as wu yi tea) as a weight loss product. These sites overlap a lot with sites selling the acai berry.

Spam Sites and the Squeeze Page:

If you've ever searched for tea online, and probably even if you haven't, you have likely encountered spammy websites promoting weight loss products. Here is a screenshot of a typical spam site:

[Screenshot of a typical spam site]

This is an example of what is called a squeeze page: the page looks rich, filled with lots of different images and text, but all of it points visually to a single link, which is selling a product. The only other outgoing links on the page are typically to ads. This way, the owner of the site either feeds the person through to a payout page, or earns money when the visitor to the site clicks an ad to another site (pictured on the right of the above screenshot).
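One rough way to check for this pattern in code is to look at where a page's links actually go. The Python sketch below is only an illustration, not a real spam detector: the function name `outbound_link_profile` and the example URLs are made up for this post, and how high a share counts as suspicious is a judgment call. It collects the links on a page, drops the ones that stay on the same site, and reports how many distinct outside destinations there are and what share of the outbound links point to the most common one.

```python
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def outbound_link_profile(html, page_url):
    """Hypothetical helper: summarize where a page's outbound links go.

    Returns (number of distinct outbound URLs,
             share of outbound links pointing to the most common one).
    """
    collector = LinkCollector()
    collector.feed(html)
    page_host = urlparse(page_url).netloc

    outbound = []
    for href in collector.hrefs:
        absolute = urljoin(page_url, href)  # resolve relative links
        parsed = urlparse(absolute)
        # Keep only web links that leave the current site.
        if parsed.scheme in ("http", "https") and parsed.netloc != page_host:
            outbound.append(absolute)

    if not outbound:
        return 0, 0.0
    counts = Counter(outbound)
    top_count = counts.most_common(1)[0][1]
    return len(counts), top_count / len(outbound)


# Made-up page: three links to one offer, one ad link, one internal link.
sample_html = (
    '<a href="http://buy.example/offer">Buy now</a>'
    '<a href="http://buy.example/offer"><img src="tea.png"></a>'
    '<a href="/about">About</a>'
    '<a href="http://buy.example/offer">Order today</a>'
    '<a href="http://ads.example/click?id=1">Ad</a>'
)
distinct, top_share = outbound_link_profile(sample_html, "http://teaspam.example/")
```

A page with only a handful of distinct outside destinations, most of its links funneling to a single one, fits the squeeze page description above. Plenty of legitimate pages would also score this way, so I would treat the result as a prompt to look closer rather than as proof of spam.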

Spam Blogs and Stolen Content:

Besides the overt squeeze page, a number of spam blogs operate by posting other people's stolen articles, text, and images. The articles are usually taken from other websites, often by automated scripts, and are then posted on the blog. Different spam blogs serve different purposes: some want to make money directly through advertisements or affiliate links, whereas others exist to promote other websites that sell a product or make money through ads or affiliate programs.

Is this a problem with tea-related topics?

Absolutely. There are so many spam blogs on the topic of tea that they render Google Blog Search almost useless. This is especially true of green tea, due to all the health hype on this topic. If you run a Google Blog Search on "tea -party" (filtered to avoid tea party political blogs, which otherwise dominate the results), you see mostly spam. A search on "green tea" is even worse.

Why is this a problem worth dealing with and not just an annoyance?

All this spam makes it harder for people to locate what they're looking for and find accurate information about tea. In addition, the spammers are earning money -- and our society would be better off if that money were instead in the hands of people who were providing a valuable service to society rather than just junking up the internet.

Dealing with spam sites:

The most important thing about spam sites is to not link to them, and not link to articles that link to them, as this indirectly helps promote the sites. This may seem like common sense, but I routinely see newer bloggers and casual internet users falling into this trap. But there is more you can do to actually crack down on these sites.

If you ever see a spam site hosted at wordpress.com or blogger.com, you can use the built-in facilities of these blogging sites to report the blogs as spam. Blogger displays a "Report Abuse" option in the toolbar at the top of each blog. If you don't see this link (some spam bloggers use clever JavaScript code to disable this feature), you can go directly to Blogger's page to report spam blogs. For WordPress, you must be logged in, and then under "Blog Info" on the toolbar you can select "Report as spam". WordPress in particular is very good at quickly cracking down on these sites.

If you see a spam site returned in Google search results, you can also submit a Google spam report. This is only appropriate in some cases, such as when a site is overtly violating Google's guidelines (the checkboxes on that page give a clear sense of how and when this reporting form is appropriate), but when it is appropriate, it will result in Google quickly pulling the page from search results.

If a spam site has Google ads, the most important thing is to not click the ads, as this will generate money for the spammer. However, there is a little link on the ads that reads "Ads by Google"; if you click this link, it takes you to a page that allows you to report the ads for a violation of Google's guidelines. If the website has cleverly disabled this link but you're sure the ads are Google ads, you can directly visit Google's page to report an AdSense violation.

Does it work?

Yes. Even if you choose to do only one or two of these things and only when it is convenient or very straightforward, you will be helping to make the web a better place. I am consistently surprised by how quickly I see spam sites taken down after I report them. Usually, one person reporting them is enough to get them taken down, often in less than 24 hours.

2 comments:

  1. Great post Alex. It's always so frustrating when I search for something on Google and end up on one of those ridiculous "squeeze" pages that don't have any useful content. It's pretty amazing that so many of these guys out there exist. I'm looking forward to the upcoming post on more subtle spamming.

    On Twitter, do you think stuff like paper.li counts as spam? I'm not sure if I understand what it's for when people tweet out "Daily" newspapers.

  2. I've never paid enough attention to paper.li to assess whether or not it looks spammy. Paper.li does take pieces of content, but it's just excerpts. This is a gray area.

    I think spam is all about the motivation: there needs to be an intent to profit and some overstepping of legal (usually copyright) or ethical bounds for it to be spam; otherwise it's just a sloppy or not useful site. Some sites may have little or no original content, and be not very relevant, but still not be spam. I'd even say if a site steals content alone, this doesn't suffice to make it spam.

    But I'm actually hoping to follow up with a post in a little while about these "gray areas" as there are some sites that look very definitively not spammy that may actually be using more solidly spam techniques than one might think.
