Bayesian spam classification: Time efficient radix encoded fragmented database approach
| Published in | INDIACom : 2014 International Conference on Computing for Sustainable Global Development : 5-7 March 2014 pp. 939 - 942 |
|---|---|
| Main Authors | Jatana, Nishtha; Sharma, Kapil |
| Format | Conference Proceeding |
| Language | English |
| Published | Bharati Vidyapeeth University, 01.03.2014 |
| ISBN | 9380544103 9789380544106 |
| DOI | 10.1109/IndiaCom.2014.6828102 |
| Abstract | Spam, or unsolicited email, has become a major problem for companies and private users. This paper presents the problems associated with spam and various approaches that attempt to deal with it. Statistical classifiers are one such group of methods; they show adequate performance in filtering spam, based on knowledge gathered from previously collected and classified emails. Learning algorithms that use the Naive Bayesian classifier have shown promising results in separating spam from legitimate mail. An encoded and fragmented database approach that resembles the radix sort technique is proposed and applied for the first time to improve Paul Graham's Naive Bayes machine learning algorithm for spam filtering. The main objective of this paper is to reduce the overall time taken by spam detection. Quantitative and qualitative analysis of the proposed technique, performed on two public spam databases (SpamAssassin and Ling Spam), shows improved time performance. The proposed method performed up to six times faster than Paul Graham's existing Bayesian approach. |
| Author affiliations | Nishtha Jatana: Dept. of Comput. Sci. & Eng., Maharaja Surajmal Inst. of Technol., Delhi, India; Kapil Sharma: Dept. of Comput. Sci. & Eng., Delhi Technol. Univ., New Delhi, India |
| EISBN | 9380544111 9789380544120 938054412X 9789380544113 |
| SubjectTerms | Bayes methods; Bayesian; Filtering; Postal services; Probability; Spam; Tokenization; Training; Unsolicited electronic mail |
| URI | https://ieeexplore.ieee.org/document/6828102 |
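
The abstract names two ingredients: Paul Graham's Naive Bayes token scoring and a token database split into fragments in a radix-like fashion. This record does not describe the paper's actual encoding, so the following is only a minimal sketch under stated assumptions: the class name `FragmentedTokenStore` is hypothetical, fragments are keyed by a token's first character as a stand-in for radix buckets, and each token is counted once per message. The constants (0.4 for unseen tokens, clamping to 0.01 and 0.99, doubling the legitimate-mail count, 15 most interesting tokens) follow Graham's published description; his minimum-occurrence threshold is omitted for brevity.

```python
import re
from collections import defaultdict
from math import prod  # Python 3.8+


def tokenize(text):
    """Lowercase the text and split it into simple word-like tokens."""
    return re.findall(r"[a-z0-9$'-]+", text.lower())


class FragmentedTokenStore:
    """Hypothetical token database split into fragments by first character.

    A lookup only touches the one small fragment a token belongs to,
    which is the kind of saving a radix-style partition aims for.
    """

    def __init__(self):
        # first character -> {token: [spam_count, ham_count]}
        self.fragments = defaultdict(dict)
        self.n_spam = 0
        self.n_ham = 0

    def train(self, text, is_spam):
        if is_spam:
            self.n_spam += 1
        else:
            self.n_ham += 1
        for tok in set(tokenize(text)):  # assumption: count once per message
            counts = self.fragments[tok[0]].setdefault(tok, [0, 0])
            counts[0 if is_spam else 1] += 1

    def token_spam_prob(self, tok):
        counts = self.fragments.get(tok[0], {}).get(tok)
        if counts is None:
            return 0.4  # Graham's default for tokens never seen before
        spam, ham = counts
        b = min(1.0, spam / max(self.n_spam, 1))
        g = min(1.0, 2 * ham / max(self.n_ham, 1))  # ham counts are doubled
        return max(0.01, min(0.99, b / (b + g)))

    def spam_score(self, text, n_interesting=15):
        probs = [self.token_spam_prob(t) for t in set(tokenize(text))]
        # Keep the tokens whose probability is farthest from neutral 0.5.
        probs.sort(key=lambda p: abs(p - 0.5), reverse=True)
        top = probs[:n_interesting]
        spam_like = prod(top)
        ham_like = prod(1 - p for p in top)
        return spam_like / (spam_like + ham_like)
```

A message would typically be flagged when `spam_score` exceeds a threshold such as 0.9. Whether the paper partitions on the first character, on an encoded key, or on several digits of such a key is not stated here, so the partition key should be read as a placeholder.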