Bayesian spam classification: Time efficient radix encoded fragmented database approach
| Published in | INDIACom : 2014 International Conference on Computing for Sustainable Global Development : 5-7 March 2014 pp. 939 - 942 |
|---|---|
| Main Authors | , |
| Format | Conference Proceeding |
| Language | English |
| Published | Bharati Vidyapeeth University, 01.03.2014 |
| Subjects | |
| ISBN | 9380544103 9789380544106 |
| DOI | 10.1109/IndiaCom.2014.6828102 |
| Summary: | Spam, or unsolicited email, has become a major problem for companies and private users. This paper presents the problems associated with spam and various approaches that attempt to deal with it. Statistical classifiers are one such group of methods that show adequate performance in filtering spam, based on knowledge gathered from previously collected and classified emails. Learning algorithms that use the Naive Bayesian classifier have shown promising results in separating spam from legitimate mail. An encoded and fragmented database approach that resembles the radix sort technique is proposed and applied for the first time to improve Paul Graham's Naive Bayes machine learning algorithm for spam filtering. The main objective of this paper is to reduce the overall time taken by spam detection. Quantitative and qualitative analysis of the proposed technique, performed on two public spam databases (SpamAssassin and Ling Spam), has shown improved time performance. The proposed method performed up to six times faster than Paul Graham's existing Bayesian approach. |
|---|---|
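
The summary names two ingredients: Paul Graham's Naive Bayes scoring and a token database split into fragments in a radix-like way. The sketch below illustrates how the two could fit together. The scoring constants (doubled ham counts, clamping to 0.01 and 0.99, 0.4 for unknown tokens, the 15 most informative tokens) follow Graham's published description. The fragmentation scheme, which buckets tokens by their first character so a lookup touches only one small fragment, is an assumption made for illustration; this record does not describe the paper's actual encoding. All function names are hypothetical.

```python
from collections import defaultdict
from math import prod

UNKNOWN = 0.4  # Graham's probability for tokens never seen in training


def token_prob(good, bad, ngood, nbad):
    """Per-token spam probability in the style of Graham's filter.

    good/bad: occurrences in ham/spam; ngood/nbad: number of ham/spam mails
    (both assumed to be greater than zero).
    """
    g = 2 * good  # ham counts are doubled to bias against false positives
    b = bad
    if g + b < 5:  # too little evidence: treat as unknown
        return UNKNOWN
    pg = min(1.0, g / ngood)
    pb = min(1.0, b / nbad)
    return max(0.01, min(0.99, pb / (pg + pb)))


def build_fragments(probs):
    """Split one token table into fragments keyed by first character.

    ASSUMPTION: this bucketing only mimics the radix-like idea named in the
    summary; it is not the paper's documented scheme.
    """
    fragments = defaultdict(dict)
    for tok, p in probs.items():
        if tok:
            fragments[tok[0]][tok] = p
    return fragments


def lookup(fragments, tok):
    """Look up a token in the single fragment selected by its first character."""
    return fragments.get(tok[0], {}).get(tok, UNKNOWN)


def spam_score(fragments, tokens, n=15):
    """Combine the n most informative token probabilities (Graham's formula)."""
    ps = [lookup(fragments, t) for t in set(tokens) if t]
    ps.sort(key=lambda p: abs(p - 0.5), reverse=True)
    top = ps[:n]
    spam = prod(top)
    ham = prod(1 - p for p in top)
    return spam / (spam + ham)
```

In Graham's description a message is treated as spam when the combined score exceeds 0.9. Any speed-up from fragmentation would depend on how the fragments are stored and searched, which this sketch does not attempt to measure.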