Every e-mail system has spam, and Hotmail is no exception. However, our SmartScreen™ spam-fighting system is one of the most effective in the industry. This post will give you the inside scoop on how we use it to fight the menace that is spam, and what you can do to help.
Why do people send spam in the first place? It’s simple: for the money. Spam is big, big business. Much of it is also illegal, but, unfortunately, that hasn’t kept people from sending it.
Believe me, spam is a battle. Spammers won’t ever quit, and they are very clever. Spammers are constantly finding new ways to exploit the services that we provide to our customers, to abuse the very system that you use every day to communicate with your friends and family and to conduct your personal business. But we won’t quit, either.
Various studies, including the monthly spam reports from Symantec, indicate that upwards of 90% of all the e-mail sent over the Internet is spam. As a result, the e-mail sent to Hotmail, just like e-mail sent to other e-mail providers, is mostly spam. With 350 million active users, Hotmail is a big target, and we receive several billion e-mail messages every day.
But Hotmail removes 98% of all spam before it can even reach your Inbox. Let’s talk about how we do that, and how we’re striving to get even better.
First, let’s get some definitions out of the way. Spam is the term used widely in the industry for unsolicited commercial e-mail that is sent in bulk and indiscriminately; that is, the sender of the e-mail does not have any legitimate reason to be sending it to the recipients. No one wants spam in their Inbox.
But not all unwanted e-mail is spam. For instance, you may be receiving newsletters or product offers as a result of having signed up on a legitimate website. You may not want them now, but they aren’t being sent to you indiscriminately—after all, you signed up for them! We call this kind of e-mail graymail because you may or may not want it in your inbox—figuring out how to handle graymail is not so black and white (hence the name).
Our goal is to eliminate as much spam as possible. But we need to avoid classifying good e-mail as spam. We call good messages that are mistakenly identified as spam false positives.
So, the real trick is to eliminate as much spam as possible (ideally all of it) while keeping false positives to a minimum (ideally zero). In some sense, these are conflicting goals, which makes it a tough balancing act.
Common engineering wisdom tells us that we can’t improve what we can’t measure.
At Hotmail we track several metrics very closely. We track SITI (“spam in the inbox”) every day, and we track true spam, which is SITI with graymail excluded. We also track how often we mistakenly put legitimate e-mail in your Junk folder.
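To make these three metrics concrete, here is a minimal sketch of how they could be computed from a hand-labeled sample of mail. It reflects one reading of the definitions above and is not Hotmail's actual measurement pipeline: the `Sample` type, the label names, and the choice to express the false-positive rate as a share of all good mail are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    folder: str  # "inbox" or "junk": where the filter placed the message
    label: str   # "good", "graymail", or "spam": hypothetical hand-assigned label


def percent(part, whole):
    """Return part/whole as a percentage, or 0.0 for an empty denominator."""
    return 100.0 * part / whole if whole else 0.0


def daily_metrics(samples):
    inbox = [s for s in samples if s.folder == "inbox"]
    good = [s for s in samples if s.label == "good"]
    return {
        # SITI (assumed): share of inbox mail that is spam or graymail.
        "siti_pct": percent(sum(s.label in ("spam", "graymail") for s in inbox), len(inbox)),
        # True spam (assumed): the same share with graymail left out.
        "true_spam_pct": percent(sum(s.label == "spam" for s in inbox), len(inbox)),
        # False positives (assumed): share of good mail that landed in Junk.
        "false_positive_pct": percent(sum(s.folder == "junk" for s in good), len(good)),
    }


samples = (
    [Sample("inbox", "good")] * 7
    + [Sample("inbox", "graymail")] * 2
    + [Sample("inbox", "spam")]
    + [Sample("junk", "good")]
    + [Sample("junk", "spam")] * 3
)
print(daily_metrics(samples))
```

In this made-up sample, three of the ten inbox messages are unwanted but only one is true spam, so SITI is 30% while true spam is 10%.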
In addition to using automated systems to measure how we are doing, we of course use feedback from Hotmail customers. If you see a message in your inbox that you think is spam, you can let us know by marking it as Junk. If you see a message in your Junk folder that is not spam, you can let us know by marking it as Not Junk, or just by moving it out of the Junk folder.
Typically, about 75% of e-mail reported as spam by people using Hotmail is actually graymail—that is, legitimate e-mail, but something that a customer didn’t want and marked as spam. A good example of graymail is a newsletter or notification that you signed up for by making a purchase on a website, but didn’t really want.
So, how are we doing? Well, back in 2006 our spam problem was pretty bad. True spam was approaching 35%. That means that about one out of every three messages in your inbox was spam. Since then we’ve made tremendous progress—getting true spam under 5% and keeping it there. The graphs below show the spam trends on the Internet in general and in Hotmail over the last several years. The green triangles in the Hotmail graph show the points in time when we released new spam-fighting technology.
You can see that while the spam problem got worse on the Internet overall, Hotmail’s investments really paid off. We are now seeing historical lows not only on spam, but on our rate of false positives.
We’ve achieved our results through tremendous investments in our SmartScreen technology. Let’s talk about some of the weapons we’ve brought to bear in this fight.
Connection-time filtering. This is our first line of defense. At any point in time, our spam-fighting system has a view of the world of e-mail senders based on various sources of reputation, as well as recent trends in e-mail content. Sender reputation is largely tied to IP addresses or ranges of addresses. We use this information to set limits on how many messages any given sender can deliver to Hotmail. Setting the limit to zero effectively blocks all e-mail from that sender. For good senders we set the limit such that normal e-mail delivery is easily accommodated while minimizing the potential for abuse should the senders’ own computers get hacked. We use several sources to generate sender reputation:
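As a rough illustration of the per-sender delivery limits described in this section (not of the reputation sources themselves), the sketch below keeps a limit per IP range and counts deliveries per sending address within a time window. The `ConnectionLimiter` class, the example ranges, and the window handling are all hypothetical; the real system's data structures are not described here.

```python
import ipaddress
from collections import defaultdict


class ConnectionLimiter:
    """Hypothetical per-sender delivery limiter keyed by IP range."""

    def __init__(self, range_limits, default_limit):
        # Most specific ranges first, so a narrow override beats a broad range.
        self.range_limits = sorted(
            ((ipaddress.ip_network(cidr), limit) for cidr, limit in range_limits.items()),
            key=lambda pair: pair[0].prefixlen,
            reverse=True,
        )
        self.default_limit = default_limit
        self.delivered = defaultdict(int)  # messages accepted per IP this window

    def limit_for(self, sender_ip):
        address = ipaddress.ip_address(sender_ip)
        for network, limit in self.range_limits:
            if address in network:
                return limit
        return self.default_limit

    def accept(self, sender_ip):
        # A limit of zero rejects every message, which blocks the sender.
        if self.delivered[sender_ip] >= self.limit_for(sender_ip):
            return False
        self.delivered[sender_ip] += 1
        return True

    def start_new_window(self):
        self.delivered.clear()


limiter = ConnectionLimiter(
    {"203.0.113.0/24": 0, "198.51.100.0/24": 2},  # documentation-only IP ranges
    default_limit=1,
)
print([limiter.accept("198.51.100.5") for _ in range(3)])  # [True, True, False]
print(limiter.accept("203.0.113.9"))                       # False
```

A sender with a good reputation would simply get a limit high enough for its normal volume, which also caps the damage if its machines are later compromised.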
Content filters. We have numerous filters that we run against incoming e-mail that can identify the e-mail as spam by analyzing the content. This isn’t as simple as just searching for phrases like “watch replica.” Our SmartScreen technology uses machine learning to adapt to the trends and techniques used by spammers. The filtering system applies suitably tailored combinations of policy, content filtering, and reputation filtering to different classes of senders. We complement our own filters with third-party filters to increase the overall effectiveness. The filters identify spam with various levels of confidence. When we are absolutely certain that a message is spam, we delete it; otherwise, we put it in your Junk folder. Our content filters take out about one billion spam messages every day.
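The delete-or-Junk decision at the end of that description can be pictured as two thresholds on a spam confidence score. This is only a sketch: the `route` function, the 0-to-1 score scale, and the threshold values are assumptions chosen for illustration, not SmartScreen's real settings.

```python
def route(spam_score, delete_at=0.999, junk_at=0.5):
    """Map a spam confidence score in [0, 1] to a hypothetical action."""
    if not 0.0 <= spam_score <= 1.0:
        raise ValueError("spam_score must be between 0 and 1")
    if spam_score >= delete_at:
        return "delete"  # near-certain spam never reaches the mailbox
    if spam_score >= junk_at:
        return "junk"    # likely spam, but the user can still rescue it
    return "inbox"


for score in (0.9995, 0.8, 0.1):
    print(score, route(score))
```

Raising `junk_at` would reduce false positives at the cost of letting more spam into the inbox, which is the balancing act described earlier in the post.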
Your preferences. You control spam, too! You can set up your own blocklist and safelist and rules, and we use all of that information to do additional filtering of spam.
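A user's safelist and blocklist can be thought of as a check that runs alongside the automated filters. The sketch below is a guess at the simplest version of that idea; the `apply_preferences` function and the rule that the safelist wins over the blocklist are assumptions, not documented Hotmail behavior.

```python
def apply_preferences(sender, safelist, blocklist):
    """Return "inbox", "junk", or None when the user has no rule for this sender."""
    address = sender.strip().lower()
    if address in safelist:
        return "inbox"  # assumed precedence: an explicit safelist entry wins
    if address in blocklist:
        return "junk"
    return None  # no preference; fall back to the automated filters


safelist = {"friend@example.com"}
blocklist = {"offers@example.net"}
print(apply_preferences("Friend@Example.com", safelist, blocklist))
print(apply_preferences("offers@example.net", safelist, blocklist))
print(apply_preferences("stranger@example.org", safelist, blocklist))
```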
Time-traveling filters. You read that right. We can time travel. Well, at least our filters can. It’s actually pretty simple. We can’t always learn about a new spammer the moment they start sending spam. But once we do identify a spammer we can go back and clean out spam that they’ve already sent to your Inbox before you ever see it. We call those tools time-traveling filters, because in some sense we are able to go back in time and get rid of the spam even after we delivered it! (Of course, if you’ve already seen the spam, we can’t remove it. That, as we all know, would create a time-traveling paradox, which might destroy the universe.)
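In code terms, the idea amounts to a retroactive sweep: once a sender is flagged, remove that sender's messages from the inbox unless they have already been read. The sketch below assumes a simple in-memory inbox; the `Delivered` type and the `retract_unread` function are hypothetical names, not part of any Hotmail API.

```python
from dataclasses import dataclass


@dataclass
class Delivered:
    sender: str
    subject: str
    read: bool = False


def retract_unread(inbox, spammer):
    """Remove unread messages from a newly identified spammer; return how many."""
    kept = [m for m in inbox if m.sender != spammer or m.read]
    removed = len(inbox) - len(kept)
    inbox[:] = kept  # update the caller's list in place
    return removed


inbox = [
    Delivered("deals@spam.example", "Cheap watches"),
    Delivered("deals@spam.example", "Last chance", read=True),
    Delivered("friend@example.com", "Lunch?"),
]
print(retract_unread(inbox, "deals@spam.example"))
print([m.subject for m in inbox])
```

The message that was already read stays put, matching the rule that spam you have seen is not removed.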
Malware detection. We scan attachments and block those that contain known malware and viruses.
Tools in the Hotmail UI. Finally, we put powerful spam tools right in the user interface. We display a safety bar when you’re reading messages that warns you of potential danger based on the sender’s reputation. Links and images are turned off by default for unknown or untrustworthy senders in order to protect you from evil links and web beacons. You can help us identify spam by marking bad messages as Junk or moving messages to your Junk folder. And you can help us minimize false positives by moving legitimate messages back out of the Junk folder. Every time you mark a message as Junk (or Not Junk, as the case may be), our system gets smarter.
Our technology is only part of the solution. We rely on feedback from our users to help in the war on spam. Here are a few ways you can help make our system smarter as well as contribute to the general health of the e-mail ecosystem.
Over the last several years the Hotmail team has made critical investments in SmartScreen technologies to not only contain the spam problem, but to get ahead and stay ahead of the industry. And we’re not done—we continue to battle spam and other kinds of abuse with each release of the service.
In the next post, I’ll talk a bit more about the graymail problem, go a bit deeper on spam filtering, and offer some more tips for getting rid of spam for those of you who are still experiencing a spam problem.
Until then, I hope you’ll continue to use Hotmail and keep the comments and feedback coming.
Dick Craddock, Group Program Manager, Windows Live Hotmail
Yes, seems like Hotmail has been letting quite a lot of spam mail get through lately. Just today I received several in my Inbox. It's not excessive, but still, needs to be looked at.
@kristopher - This is a related problem we call outbound spam, where spammers create accounts on Hotmail (or other email providers) and then use them to send spam to other networks. We pay attention to this as well, and the mitigations are different. We'll provide more detail in a separate post.
Okay, I just wrote a reply here, and the site had some error. How frustrating. Let me try again...
I never used to get spam that often. Maybe a few times a week max.
Now I'm getting several a day.
It's always a hotmail.com or live.com address too.
The address is always the person's last name, followed by random characters, and then a year followed by @hotmail.com or live.com.
It's getting really annoying.
I used to only get maybe a couple of spam emails per week. However, the last 2-3 weeks I have been receiving a ton of junk mail. All the addresses are hotmail.com addresses too, which is not typical.
They are all similar too.
Here is an example:
Name: Jeanette Bruyere
Name: Darla Corino
So as you can see, the email address is always their last name, followed by random letters, and a year. Sometimes it's live.com addresses too.