Bayesian spam classification: Time efficient radix encoded fragmented database approach

Bibliographic Details
Published in INDIACom : 2014 International Conference on Computing for Sustainable Global Development : 5-7 March 2014, pp. 939-942
Main Authors Jatana, Nishtha; Sharma, Kapil
Format Conference Proceeding
Language English
Published Bharati Vidyapeeth University 01.03.2014
ISBN 9380544103
9789380544106
DOI 10.1109/IndiaCom.2014.6828102

Abstract Spam, or unsolicited email, has become a major problem for companies and private users. This paper presents the problems associated with spam and various approaches that attempt to deal with it. Statistical classifiers are one such group of methods; they show adequate performance in filtering spam, based on knowledge gathered from previously collected and classified emails. Learning algorithms that use the Naive Bayesian classifier have shown promising results in separating spam from legitimate mail. An encoded and fragmented database approach that resembles the radix sort technique is proposed and applied for the first time to improve Paul Graham's Naive Bayes machine learning algorithm for spam filtering. The main objective of this paper is to reduce the overall time taken by spam detection. Quantitative and qualitative analysis of the proposed technique, performed on two public spam databases (SpamAssassin and Ling Spam), shows improved time performance: the proposed method runs up to six times faster than Paul Graham's existing Bayesian approach.
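
The abstract names two ingredients: Paul Graham's Naive Bayes combination of per-token spam probabilities, and a token database split into fragments in a way that resembles radix sort. This record does not describe the paper's actual encoding or fragmentation scheme, so the sketch below is only an illustration under stated assumptions: `FragmentedTokenStore`, bucketing by the first character of a token, and the default probability of 0.4 for unseen tokens (the value Graham suggests) are all choices made here, not details taken from the paper.

```python
from collections import defaultdict
from math import prod


def combine(probs):
    """Graham-style naive Bayes combination of per-token spam probabilities."""
    spam = prod(probs)
    ham = prod(1.0 - p for p in probs)
    return spam / (spam + ham)


class FragmentedTokenStore:
    """Hypothetical token table split into buckets by leading character.

    This bucketing is an assumption for illustration; the paper's real
    radix-style encoding is not given in this record.
    """

    def __init__(self):
        self.buckets = defaultdict(dict)

    def put(self, token, prob):
        # Clamp to [0.01, 0.99] so one token can never force a 0/0 division.
        self.buckets[token[0]][token] = max(0.01, min(0.99, prob))

    def get(self, token, default=0.4):
        # Only the fragment selected by the first character is consulted.
        return self.buckets.get(token[0], {}).get(token, default)


def spam_score(tokens, store):
    """Score a message from its non-empty tokens using the fragmented store."""
    return combine([store.get(t) for t in tokens if t])


store = FragmentedTokenStore()
store.put("viagra", 0.99)
store.put("meeting", 0.05)
score = spam_score(["viagra", "meeting", "unknown"], store)
```

In Python, dictionary lookups are already constant time on average, so the bucketing here only shows the data layout; any speedup of the kind the abstract reports would depend on the storage backend and encoding used in the paper.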
Author Jatana, Nishtha (Dept. of Comput. Sci. & Eng., Maharaja Surajmal Inst. of Technol., Delhi, India)
Sharma, Kapil (Dept. of Comput. Sci. & Eng., Delhi Technol. Univ., New Delhi, India)
Discipline Economics
EISBN 9380544111
9789380544120
938054412X
9789380544113
SubjectTerms Bayes methods
Bayesian
Filtering
formatting
insert
Postal services
Probability
Spam
style
styling
Tokenization
Training
Unsolicited electronic mail
URI https://ieeexplore.ieee.org/document/6828102