This paper is published in Volume-4, Issue-4, 2018
Area
Natural Language Processing
Author
Sujesh Shankar
Org/Univ
Vellore Institute of Technology, Vellore, Tamil Nadu, India
Keywords
Natural Language Processing (NLP), Spam detection, Online security, Spam filtering
Citations
IEEE
Sujesh Shankar. Advanced detection of spam and email filtering using natural language processing algorithms, International Journal of Advance Research, Ideas and Innovations in Technology, www.IJARIIT.com.
APA
Sujesh Shankar (2018). Advanced detection of spam and email filtering using natural language processing algorithms. International Journal of Advance Research, Ideas and Innovations in Technology, 4(4) www.IJARIIT.com.
MLA
Sujesh Shankar. "Advanced detection of spam and email filtering using natural language processing algorithms." International Journal of Advance Research, Ideas and Innovations in Technology 4.4 (2018). www.IJARIIT.com.
Abstract
Unsolicited bulk emails sent from random email addresses to a user's inbox are generally called junk or spam emails. About 45% of all emails sent are spam, and 14.5 billion spam emails are sent every day. Around 36% of spam emails contain sales, advertising, and promotional content that the recipient did not opt to receive. However, not all spam serves this purpose: spam emails are also used for phishing, deceiving recipients and leading them to malicious websites. Numerous techniques have been developed to block spam emails, but a majority of users still receive them because spammers are able to manipulate the filters. Spam costs businesses $20.5 billion every year, and this cost is likely to keep rising: data indicates that losses to businesses will grow to $257 billion annually within a few years if the current rate of spam email is not reduced. To curb this problem, we present a method based on Natural Language Processing (NLP) for filtering spam emails in order to enhance online security. The proposed technique blocks spam mail stepwise, based on the sender's email address along with the content of the email. The proposed NLP system uses an N-gram model, a word stemming algorithm, and a Bayesian classification algorithm to detect spam content and filter it effectively.
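As a rough illustration of how the three components named in the abstract could fit together, the following Python sketch chains a sender check, word stemming, N-gram extraction, and a naive Bayes classifier. It is not the paper's implementation. The suffix-stripping `stem` function is a crude stand-in for a real stemming algorithm, the choice of bigrams (`n=2`), the add-one smoothing, the order of the two steps, the blocklist, and the four training messages are all assumptions made for this example, and every function and class name is hypothetical.

```python
import math
import re
from collections import Counter


def stem(word):
    """Crude suffix stripping; a stand-in for a real stemming algorithm."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def features(text, n=2):
    """Lowercase, tokenize, stem, then return unigrams plus word n-grams."""
    tokens = [stem(t) for t in re.findall(r"[a-z0-9$]+", text.lower())]
    grams = [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return tokens + grams


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.docs = Counter()

    def train(self, text, label):
        self.counts[label].update(features(text))
        self.docs[label] += 1

    def log_score(self, text, label):
        vocab = set(self.counts["spam"]) | set(self.counts["ham"])
        total = sum(self.counts[label].values())
        # Log prior of the class, then one smoothed log likelihood per feature.
        score = math.log(self.docs[label] / sum(self.docs.values()))
        for f in features(text):
            score += math.log((self.counts[label][f] + 1) / (total + len(vocab)))
        return score

    def classify(self, text):
        return max(("spam", "ham"), key=lambda label: self.log_score(text, label))


def filter_email(sender, body, blocked_senders, model):
    """Step 1: reject known bad senders. Step 2: classify the content."""
    if sender.lower() in blocked_senders:
        return "spam"
    return model.classify(body)


# Hypothetical toy training data, for illustration only.
model = NaiveBayes()
model.train("Win a free prize now, claim your free money", "spam")
model.train("Limited offer: buy cheap pills, free shipping", "spam")
model.train("Meeting moved to Monday, agenda attached", "ham")
model.train("Please review the attached project report", "ham")

blocked = {"promo@spam.example"}
print(filter_email("alice@work.example", "Claim your free prize", blocked, model))
print(filter_email("bob@work.example", "Project meeting agenda attached", blocked, model))
print(filter_email("promo@spam.example", "Project meeting agenda attached", blocked, model))
```

With this toy data the first and third messages are labelled spam (by content and by sender, respectively) and the second is labelled ham. A realistic system would need a far larger labelled corpus and an established stemmer before its output could be trusted.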