Spammers are employing a new tactic to attack blogs, and it’s a tactic that could bring down anti-spam measures protecting not only the Blogosphere, but e‑mail too.
Blog spammers are attacking blogs with their typical messages, but they have a new, ingenious, and potentially catastrophic trick. They’re incorporating links to legitimate, respectable domains into those attacks. The net result is that automated spam filters, even so-called “smart filters” like Dr. Dave’s Spam Karma plug-in system for WordPress-based blogs, are Blacklisting domains like CNN.com, IMDB.com, MacCentral.com, and dozens of others.
The anti-spam logic engine on one of my sites that ranks highly in search engines, for example, Blacklists 15–25 respectable domains daily. Should I actually get a comment from someone at MacCentral.com, the anti-spam system will kill the comment before I see it. In the past that site has received comments from people working for, and e‑mailing from, MacCentral.com.
Although someone might have already coined a different term, I call this type of spamming Whitelist Attacks.
This is only the beginning of the Whitelist Attacks. From one or two per week 90 days ago, Whitelist Attacks are now up to an average of two dozen per day on each of my Web sites and those of several other professional bloggers. Whitelist Attacks are effective, and their scope and frequency are increasing.
Whitelist Attacks are designed to accomplish two goals:
1. Exploit Whitelists of respectable domains to sneak past spam filters; and
2. Cause a sufficient number of erroneously Blacklisted domains that bloggers and e‑mail administrators abandon automated filters entirely.
While the top spam filtering engines are currently too smart for goal #1 to work, #2 is quickly becoming a reality.
Follow Whitelist Attacking through to its logical conclusion: Spam bots record where they attack–let’s say they try to hit YourBlog.com. Once they’ve found and attacked that site, they can easily incorporate that domain into their future attacks against other sites. Therefore, hundreds (and eventually hundreds of thousands) of automated spam filters will begin Blacklisting YourBlog.com. In the end, the blogging community will be crippled by the fact that we’ve all Blacklisted each other’s domains.
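The cascade described above can be sketched as a toy simulation. This is only an illustration under stated assumptions: the domain names are made up, the bot attacks the blogs in a fixed order, and every filter naively Blacklists each domain it finds in a spam comment.

```python
SPAM_DOMAIN = "spam.example"  # hypothetical spammer domain


def simulate(blogs, rounds):
    """Return each blog's blacklist after the bot cycles through `blogs`."""
    blacklists = {blog: set() for blog in blogs}
    attacked = []  # domains the bot has already hit, in order
    for _ in range(rounds):
        for target in blogs:
            # The spam links to the spammer's site plus every blog hit so far.
            linked = {SPAM_DOMAIN} | {d for d in attacked if d != target}
            # Assumption: a naive filter blacklists every domain in the spam.
            blacklists[target] |= linked
            if target not in attacked:
                attacked.append(target)
    return blacklists


blogs = ["blog-a.example", "blog-b.example", "blog-c.example"]
for blog, blocked in simulate(blogs, rounds=2).items():
    print(blog, "blacklists", sorted(blocked - {SPAM_DOMAIN}))
```

In this sketch, after two passes every blog has Blacklisted every other blog, which is the mutual lockout the paragraph above predicts.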
The effects won’t be limited to preventing bloggers from commenting on each other’s blogs.
The Blacklists generated by blog spam filters are frequently shared–even among non-bloggers–and often exported for use as e‑mail Blacklists. Imagine some, then half, then most of your personal and professional e‑mail being undeliverable because an automated system has your domain on a Blacklist.
Blacklists are also often published online–viewable by the public and indexed by search engines. What damage would your reputation suffer from being publicly labeled a spammer? If you work at the Gap, it probably wouldn’t bother you too much–it might even raise your street cred. But if you’re a professional…
Three-List Exploits
Current anti-spam systems typically evaluate blog comment and e‑mail content looking for bad words, known bad domains, or the incorporation of more than an arbitrary number of links. If the systems find any one of these conditions, they stop the message from getting through by placing it into a moderation queue or mail folder, or by killing it outright. The more advanced of such systems operate on a three-list principle:
Whitelists are known good domains and terms, and are usually administered manually–a human must deliberately add to the list a condition that, if met, passes the message through without further challenge.
Blacklists are known bad domains and terms. These are the pharmaceutical, adult, and online game terms we all know and despise, and those (whom we despise even more) who peddle them through unsolicited e‑mail and blog comments. Messages containing domain names or terms on the Blacklist are stopped and held or deleted before delivery.
In between the two extremes of always deliver (Whitelists) and always stop (Blacklists) are Greylists. Greylists contain terms or conditions that may be marks of spamming, but may also be innocuous–the evaluation of which is too complicated for current technology, and must be left to a human. Once a message meets Greylist conditions, it is segregated from the rest of the e‑mail or comments, and placed into a moderation queue or special folder for later human evaluation. The e‑mail administrator or blogger will then manually enter the queue or folder and review the message content to make a final determination of its fate.
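As a rough illustration of the three lists, here is a minimal sketch in Python. It is not how Spam Karma or any particular engine works. The function and parameter names, the sample domains, the link limit, and the order of the checks (Blacklist first, then Whitelist, then Greylist) are all assumptions made for the example.

```python
import re
from enum import Enum


class Verdict(Enum):
    DELIVER = "deliver"    # passed through
    MODERATE = "moderate"  # held for a human (Greylist)
    BLOCK = "block"        # stopped (Blacklist)


# Captures the host part of http(s) links, dropping a leading "www."
URL_DOMAIN = re.compile(r"https?://(?:www\.)?([a-z0-9.-]+)", re.IGNORECASE)


def classify(text, whitelist, blacklist, grey_terms, max_links=3):
    """Classify one comment or e-mail body against the three lists."""
    links = [d.lower().rstrip(".") for d in URL_DOMAIN.findall(text)]
    domains = set(links)
    words = set(re.findall(r"[a-z0-9']+", text.lower()))

    # Assumption: known-bad domains or terms win over everything else.
    if domains & blacklist or words & blacklist:
        return Verdict.BLOCK
    # A manually Whitelisted domain passes without further challenge.
    if domains & whitelist:
        return Verdict.DELIVER
    # Suspicious terms or too many links go to the moderation queue.
    if words & grey_terms or len(links) > max_links:
        return Verdict.MODERATE
    return Verdict.DELIVER


whitelist = {"cnn.com"}
blacklist = {"pills.example"}
grey_terms = {"casino"}

print(classify("Nice post, see http://www.cnn.com/today",
               whitelist, blacklist, grey_terms))
print(classify("Meds at http://pills.example, news at http://cnn.com/x",
               whitelist, blacklist, grey_terms))
print(classify("Try the casino tonight", whitelist, blacklist, grey_terms))
```

Checking the Blacklist before the Whitelist is a design choice in this sketch: it keeps a respectable link from rescuing a message that also carries a known-bad one.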
Whitelist Spamming and Whitelist Attacks attempt to use the three-list system against itself by either slipping through on a Whitelist approval condition, or by causing so many false positives that denials and segregation based on Blacklists and even Greylists become self-defeating and are abandoned.
Because of the content and quality of their messages, spammers are often characterized as uneducated, stupid, or random and disorganized. Nothing could be further from the truth.
Spamming is a profitable business, with annual global revenues measured in billions of dollars. While some spammers are the uneducated morons who believe every get-rich-quick scheme Carleton Sheets tries to sell them on late night television, they are not the ones from whom you will typically receive spam. The majority of spam comes from large, exceptionally organized, and highly motivated syndicates whose numerous crimes are grounded in the real-world concerns of drugs, guns, and racketeering. Spam and spam-related activities are merely one of their business interests. These organizations have virtually unlimited funding for research and development of new techniques and methodologies to defeat anti-spam measures, and they employ some very intelligent people for that purpose.
Those who perpetrate Whitelist Attacks understand how computers, the Internet, and your mind operate. They realize the limitations of three-list anti-spam techniques, and, more to the point, they recognize that administrators of such systems are too busy to babysit them. Whitelist spammers know that the more time they force us to manually scrutinize our automated White‑, Grey‑, and Blacklists, the less useful those lists become. Automated systems only work for us so long as they remain automated; the moment we perceive administration of those automated systems as becoming more labor‑, time‑, or mentally-intensive than our perception of dealing with spam at the inbox phase, we will abandon those automated systems entirely–thus opening the floodgates to spam once more.
As spammers well know, three-list filtering is the most effective and accessible anti-spam methodology currently available. In the eyes of the professional spam industry, three-list filtering on blogs and mailboxes is the single largest impediment to growing their bottom line. Beating it is their highest priority. With Whitelist Attacks–simply adding one more URL to their messages–they have indeed found an easy, effective, and low-cost way of defeating three-list spam filtering.
Someone needs to find a way to combat Whitelist Attacks–and they must do it swiftly. More advanced algorithms need to be devised, algorithms that evaluate the style, structure, and verbiage of blog comments and e‑mail messages, but that also have the ability to recognize and extract reputable domains. Global Whitelists must be created to prevent every domain referenced in a spam message from being automatically added to Blacklists. If an evaluated message contains adult-oriented text and a link to a domain that meets rule definitions as being undesirable, but just happens to have a spoofed return address of Steve.Jobs@Apple.com, the automated filters protecting the mailbox need to be smart enough to add the spam domain to the Blacklist for future matching, but to not add Apple.com to the Blacklist.
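One way to picture that last requirement is a learning step that consults a list of protected domains before it adds anything to the Blacklist. The sketch below is a hypothetical illustration, not an existing plug-in's API: the function names and the pills.example domain are invented. It also assumes that being protected only shields a domain from automatic Blacklisting and does not let a message through.

```python
def learn_from_spam(spam_domains, blacklist, protected):
    """Blacklist the domains seen in a confirmed spam message,
    skipping any domain on the protected (global Whitelist) set."""
    new_entries = set(spam_domains) - protected
    blacklist |= new_entries  # updates the caller's set in place
    return new_entries


def is_blocked(message_domains, blacklist):
    """A message is blocked if it references any Blacklisted domain.
    Protected domains get no free pass here."""
    return bool(set(message_domains) & blacklist)


protected = {"apple.com", "cnn.com"}  # hypothetical global Whitelist
blacklist = set()

# Spam that pairs a spoofed respectable domain with the real target.
learn_from_spam({"apple.com", "pills.example"}, blacklist, protected)

print(sorted(blacklist))                                   # spam domain only
print(is_blocked({"apple.com", "pills.example"}, blacklist))  # repeat spam
print(is_blocked({"apple.com"}, blacklist))                # legitimate mail
```

Under these assumptions the repeat spam is still stopped by its own domain, while a genuine message that references only Apple.com is left alone.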
Whitelist attacking is an ingenious response by professional spammers to the most advanced anti-spam systems currently protecting blogs and e‑mail inboxes. It’s a methodology that carries grave consequences for hundreds of thousands of bloggers, and whose effects will, if left unchecked, cripple the Blogosphere. More grave still, the reach and potential damage of Whitelist Attacks hit e‑mail filtering systems equally and threaten the Internet far, far beyond blogs.
Am I being naïve to suggest that whitelist domains such as Apple.com in your example above simply be given “protected” status? There are user-defined ways to identify truly worthwhile sites (StumbleUpon.com being one popular application of the technology) and separate out the drivel, or worse.
Hi, Matthew.
See, that’s just the problem of whitelist spam: If you protect Apple.com, then any spam message that includes that domain would automatically get through. That’s exactly what spammers are hoping for, which is why they’re including sites like Apple.com in their messages. Three-list anti-spam engines aren’t yet smart enough to figure out what to do with a message containing two URLs, one being unknown to it (Apple.com) and the other known bad (a porn site, for example). In those cases, the anti-spam engine creates an association between the known bad and the unknown, deciding that the unknown must be bad and therefore should be blacklisted.
While it is feasible for humans to go in and individually whitelist good domains, it’s totally unwieldy to whitelist the millions of respectable domains out there. While most sites will never get a fraction of those as blog comments or e‑mail, there are still thousands of potential domains from which desired messages may come. One cannot whitelist them all, nor can one realistically investigate every blacklisted message or domain on a busy site.
See the problem now?
Yeah. And I definitely think you’re on to something that’s not getting much ink in mainstream press, but probably should be. (Or would that only make things exponentially worse?)
Spammers tend to share information like any other profession. I don’t think mainstream press coverage would exacerbate the problem. It would, however, get more people working on a way to combat it.
I like your site.
Blogs with comments windows that you have to click to open keep the dialogue under wraps. Blogs like this one which string comments out in the open are much more proactive about sparking dialogue.
See the Sender Policy Framework (SPF) at http://www.openspf.org. If the email says it’s from Apple.com but the sending email server doesn’t match an allowed IP address for one of Apple’s listed email servers, it doesn’t get through. The SPF check needs to be before the whitelist check. It’s not perfect, and subject to a DNS attack, but it would make the spammer’s job more difficult if every email domain had it implemented.
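The ordering suggested here, SPF before the whitelist, might look roughly like the sketch below. It is a simplification under loud assumptions: a hard-coded table stands in for the DNS lookups that real SPF performs, the example.com domain and the IP ranges are documentation placeholders, and the function names are invented for illustration.

```python
import ipaddress

# Hypothetical stand-in for published SPF records. Real SPF reads these
# from DNS TXT records and supports a much richer policy syntax.
ALLOWED_SENDERS = {
    "example.com": [ipaddress.ip_network("192.0.2.0/24")],
}


def spf_pass(sender_domain, sending_ip):
    """True if sending_ip falls in a network listed for sender_domain."""
    ip = ipaddress.ip_address(sending_ip)
    return any(ip in net for net in ALLOWED_SENDERS.get(sender_domain, []))


def accept(sender_address, sending_ip, whitelist):
    """Run the SPF-style check first, then the whitelist check."""
    domain = sender_address.rsplit("@", 1)[-1].lower()
    # Reject only when the domain publishes a policy and the IP is not in it.
    if domain in ALLOWED_SENDERS and not spf_pass(domain, sending_ip):
        return "reject"
    if domain in whitelist:
        return "deliver"
    return "filter"  # hand off to the normal spam filter


whitelist = {"example.com"}
print(accept("steve@example.com", "192.0.2.10", whitelist))   # genuine server
print(accept("steve@example.com", "203.0.113.5", whitelist))  # spoofed sender
```

Because the sender check runs first in this sketch, a spoofed whitelisted address never reaches the whitelist test.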
Someone else below asked this already about antispam scripts.
I am getting nailed with spam on my website mails and on our blog website – now it’s offline, too much spam. Is there any way to stop this? If not, there really isn’t any point in leaving it up and active. Any help will be greatly appreciated.
Thanks for help, Keep up the good work. Greetings from Poland