Chapter 5 Conclusion and Future Work

Identifying and reducing the volume of outbound spam in high-volume e-mail systems is a worthwhile endeavour. The operators of the system benefit from an improved reputation, as do the e-mail users, who eventually receive less spam.

This work demonstrates that in commercial e-mail systems with a wide range of users, a simple metric such as volume alone is insufficient to determine whether a sender is a spammer. The dataset contains many senders who distribute a high volume of e-mail through legitimate business activities or mailing lists, and who would be trapped by volumetric analysis. Conversely, volumetric analysis would fail to detect low-volume spammers.

Building a complete view of a sender’s activities across all servers that are part of an outbound e-mail system is valuable because it allows analysis of complex behaviours.

With this complete information, it is possible to trace the SMTP reply codes issued by foreign servers back to individual senders. The results support the hypothesis that a large percentage of rejected messages in a sender’s outbound e-mail flow is a good metric for identifying that sender as a spammer. The results further suggest that for every e-mail system, a cut-off value exists that could greatly reduce outbound spam with a low rate of false positives.
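The rejection metric summarized above can be illustrated with a minimal sketch. It is not the implementation used in this work: the function names, the input format of (sender, reply code) pairs, and the choice to count only 5xx permanent-failure replies as rejections are assumptions made for illustration, and the cut-off value would have to be determined separately for each e-mail system.

```python
from collections import defaultdict


def rejection_rates(deliveries):
    """Return {sender: fraction of delivery attempts that were rejected}.

    `deliveries` is an iterable of (sender, smtp_reply_code) pairs, a
    hypothetical format assumed to have been extracted from the MTA logs.
    Assumption: only 5xx (permanent failure) replies count as rejections.
    """
    totals = defaultdict(int)
    rejected = defaultdict(int)
    for sender, code in deliveries:
        totals[sender] += 1
        if 500 <= code <= 599:
            rejected[sender] += 1
    return {sender: rejected[sender] / totals[sender] for sender in totals}


def flag_senders(deliveries, cutoff):
    """Return the set of senders whose rejection rate exceeds `cutoff`."""
    rates = rejection_rates(deliveries)
    return {sender for sender, rate in rates.items() if rate > cutoff}
```

Because the rate is computed per sender over all delivery attempts, the log entries from every server in the outbound system would need to be merged before calling these functions.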

5.1. Future Work

It would be interesting to include further metrics when analysing outbound e-mail flows.

When selecting further metrics, the data source for these metrics must be taken into consideration. This work has relied on system logs, which restricts analysis to envelope data and very basic message information.

Message size is available in the system logs generated by Postfix and other MTAs. In analysing the data for this thesis, it was observed that messages sent to a mailing list by non-spammer senders appeared to be identical or nearly identical in size. Slight variances could imply a minor change in the message for personalization, such as placing the recipient’s name in the message body. However, some of the messages that spammers sent to what appeared to be a mailing list varied greatly in size. The question is whether the variance is due to message padding (a countermeasure to content analysis) or whether an entirely different payload is being sent. It may be that, for a given user, a large variance in the size of messages sent within a short period of time is an indication of spammer activity. Validating this metric without the message content would be difficult, as it would be important to verify that sequential messages carry a similar body or intent.
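One way the proposed size-variance metric might be computed from log data is sketched below. This is an untested idea, not a result of this work: the input tuple format, the fixed ten-minute window, and the use of the coefficient of variation (standard deviation divided by mean) as the measure of spread are all assumptions chosen for illustration.

```python
from collections import defaultdict
from statistics import mean, pstdev


def size_variation(messages, window_seconds=600):
    """Return {(sender, window_index): coefficient of variation of sizes}.

    `messages` is an iterable of (sender, unix_timestamp, size_bytes)
    tuples, a hypothetical format assumed to come from the MTA logs.
    Messages are grouped per sender into fixed windows of
    `window_seconds`; windows with fewer than two messages are skipped
    because no spread can be measured from a single message.
    """
    buckets = defaultdict(list)
    for sender, timestamp, size in messages:
        window_index = int(timestamp // window_seconds)
        buckets[(sender, window_index)].append(size)

    variation = {}
    for key, sizes in buckets.items():
        if len(sizes) < 2:
            continue
        average = mean(sizes)
        if average > 0:
            variation[key] = pstdev(sizes) / average
    return variation
```

A value near zero would correspond to the near-identical sizes observed for legitimate mailing lists, while a large value would mark a window for closer inspection. What threshold separates the two is an open question that the available data cannot answer without message content.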

Message content is not checked in the analysis done in this work. It would be interesting to run automated tests, such as applying anti-spam or anti-virus software to outbound flows. This idea has been suggested in the past and has often been rejected as too computationally expensive. However, the cost of processing power continues to decrease, which could make this viable. Alternatively, content analysis could be used to validate the results of other metrics, such as the recipient system SMTP response code used in this work. The computational cost may be acceptable if content analysis is applied only in cases of uncertainty. For this work, uncertainty describes senders with over 2000 messages per week and a rejection rate of over 10%. Applying content scanning to this corpus would restrict this computationally expensive activity to 9% of the total outbound flow. The benefit would be the potential removal of up to 65% of outbound spam, a 20% increase.
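The selection of uncertain senders for content scanning could be expressed as in the following sketch. The thresholds are those stated above (over 2000 messages per week and over 10% rejection); the function names and the per-sender input format are hypothetical, and the content scanner itself is deliberately left out.

```python
def needs_content_scan(weekly_messages, weekly_rejected,
                       min_messages=2000, min_rejection=0.10):
    """Return True if a sender falls into the 'uncertain' group.

    A sender qualifies with more than `min_messages` messages in the
    week and a rejection rate above `min_rejection`.
    """
    if weekly_messages <= min_messages:
        return False
    return weekly_rejected / weekly_messages > min_rejection


def scan_share(senders):
    """Return the fraction of the total outbound flow that would be scanned.

    `senders` maps a sender to a (weekly_messages, weekly_rejected)
    pair, a hypothetical summary assumed to be built from the logs.
    """
    total = sum(messages for messages, _ in senders.values())
    scanned = sum(messages for messages, rejected in senders.values()
                  if needs_content_scan(messages, rejected))
    return scanned / total if total else 0.0
```

Computing the scanned share before deploying a scanner gives an estimate of the added processing cost for a given pair of thresholds.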