Theoretical and experimental studies on textual antispam filtering

Abstract


Mahathir A. Anwar1, Lee Choo Bernard1, Abdullah H. Victor1 and Najib Jimmy Ibrahim2*

Spam is unsolicited bulk messaging sent indiscriminately. According to Wikipedia and a Cisco report, more than 31 trillion spam messages were sent in 2009. These “junk mails” can carry various kinds of content, such as commercial advertising, pornography, viruses, dubious products, get-rich-quick schemes, and quasi-legal services. This paper focuses on text spam: it describes the text spam process and the tricks spammers use. It then describes the implementation of text content analysis and classification using several document processing techniques (stop words, short word forms, regular expressions, stemming, etc.) and a naive Bayesian classifier. Finally, it presents practical work that applies document processing and the naive Bayesian classifier toward an accurate anti-spam system.
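The pipeline named in the abstract (regular-expression tokenizing, stop-word removal, stemming, then a naive Bayesian classifier) can be illustrated with a minimal sketch. This is not the authors' implementation: the stop-word list, the `crude_stem` suffix stripper, the `NaiveBayes` class, and the four training messages are all hypothetical and chosen only to keep the example small. It assumes a multinomial model with add-one (Laplace) smoothing, which the abstract does not specify.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (assumption, not from the paper).
STOP_WORDS = {"a", "an", "the", "to", "and", "of", "is", "you", "for", "in", "at", "your"}

# Regular expression that keeps runs of lowercase letters as tokens.
TOKEN_RE = re.compile(r"[a-z]+")


def crude_stem(word):
    """Rough suffix stripping; a stand-in for a real stemmer such as Porter's."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Lowercase, tokenize with a regex, drop stop words, then stem."""
    tokens = TOKEN_RE.findall(text.lower())
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]


class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing (assumed variant)."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of token frequencies
        self.doc_counts = Counter()  # label -> number of training messages
        self.vocab = set()           # all tokens seen during training

    def train(self, text, label):
        tokens = preprocess(text)
        self.word_counts.setdefault(label, Counter()).update(tokens)
        self.doc_counts[label] += 1
        self.vocab.update(tokens)

    def log_score(self, tokens, label):
        # log P(label) + sum of log P(token | label), computed in log space
        # so that products of small probabilities do not underflow.
        total_docs = sum(self.doc_counts.values())
        score = math.log(self.doc_counts[label] / total_docs)
        counts = self.word_counts[label]
        denom = sum(counts.values()) + len(self.vocab)
        for t in tokens:
            score += math.log((counts[t] + 1) / denom)
        return score

    def classify(self, text):
        tokens = preprocess(text)
        return max(self.doc_counts, key=lambda label: self.log_score(tokens, label))


# Toy training data, invented for illustration only.
clf = NaiveBayes()
clf.train("Get rich quick, win cash prizes now", "spam")
clf.train("Cheap pills, buy now and win money", "spam")
clf.train("Meeting moved to Monday at noon", "ham")
clf.train("Please review the attached project report", "ham")

print(clf.classify("Win cash now"))
print(clf.classify("Project meeting on Monday"))
```

Stemming and stop-word removal shrink the vocabulary so that variants such as "prize" and "prizes" share one count, while smoothing keeps an unseen token from forcing a class probability to zero. A production filter would need a much larger corpus and a proper stemmer.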
