SPAM - eMail Address Protection Study

eMail Address Protection Study, Damien Giry and Michael Neve

Even though email represents only a tiny fraction of the traffic volume on public IP networks (a little more than 1%), it involves a gigantic number of messages: around 31 billion per day worldwide in 2002, with projections of more than 60 billion for 2006. These figures should be treated with caution, however: other sources give higher numbers (about 220 billion for 2004). Email currently stands out as the most popular means of communication ever. None of these figures would be a problem were it not for spam: unsolicited commercial email (UCE), also known as unsolicited bulk email (UBE) or junk mail.

The meaning of spam has changed over time. Originally, a message was called spam when it was broadcast widely across newsgroups. Later, junk email was named after Hormel Foods' canned meat (SPiced hAM or Shoulder Pork and hAM), in reference to the Monty Python's Flying Circus sketch in which everything on the menu contained spam, wanted or not.

The paper follows the structure of the project, which was organized in three phases. The first focuses on passive user behavior, as if all received emails were read offline. Web sites are created under various domain names, and each page contains a set of email addresses with different levels of protection. These pages and sites act as honeypots: all emails sent to the domain addresses are received and stored for further analysis. The effect of a site's notoriety is tested as well. Section 2 details the structure of this phase, and Section 3 draws the main conclusions after 3 months. Some hypotheses are also proposed, especially concerning the approach behind email address harvesting.
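To make the idea of "different levels of protection" concrete, here is a minimal Python sketch of how a honeypot page could publish one address per protection level. It is an illustration only: the three protection schemes (plain text, a spelled-out "[at]"/"[dot]" form, and HTML numeric character references), the function names, the per-level address naming, and the domain example.org are all assumptions, not the schemes actually used in the study.

```python
import html

def plain(address):
    # No protection: the address appears verbatim in the page source.
    return address

def spelled_out(address):
    # Hypothetical scheme: replace "@" and "." in the domain with words.
    local, domain = address.split("@", 1)
    return local + " [at] " + domain.replace(".", " [dot] ")

def entity_encoded(address):
    # Hypothetical scheme: every character becomes a numeric character
    # reference, which a browser renders as the original character.
    return "".join("&#%d;" % ord(ch) for ch in address)

# Assumed set of protection levels; the study's real levels may differ.
PROTECTIONS = {"plain": plain, "spelled": spelled_out, "entity": entity_encoded}

def honeypot_lines(domain):
    # One distinct address per level (an assumption), so that any spam
    # received can be traced back to the protection it got past.
    lines = []
    for level, protect in PROTECTIONS.items():
        address = "%s@%s" % (level, domain)
        lines.append("<p>%s</p>" % protect(address))
    return lines

def browser_view(markup):
    # Approximates what a reader sees once character references are decoded.
    return html.unescape(markup)

if __name__ == "__main__":
    for line in honeypot_lines("example.org"):
        print(line)
```

Using a distinct address for each level is what would let stored messages be attributed to a particular protection scheme; `browser_view` shows that the entity-encoded form remains readable to a human visitor while differing from the plain address in the page source.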

Phase 2 tests those hypotheses with a new set of experiments. Section 4 therefore places the emphasis on active profiles by modifying the properties of particular sites, while other sites are kept as they were in the first phase for comparison. Other behaviors are tested as well: for example, the potential influence of posting in newsgroups or of using real applications that involve validated addresses. Section 5 presents the conclusions of this phase.

The third phase is described in Section 6, in which other influences are measured, particularly the spyware that might infest a computer. In addition, some addresses from phase 1 are removed and replaced by new ones. Conclusions are given in Section 7.

Finally, Section 8 gives the overall conclusion of this study, a seven-month-long snapshot of spam activity on real systems.

Download the full version: PDF or PS.
