
ISBN: 0542155788
Cost: $69.99
Author(s):
Praveen R. Nerellapalli

Adaptive anti-spam e-mail filtering using Huffman coding and statistical learning
"Unsolicited bulk e-mail, also known as spam, is an increasing problem for the society. This thesis presents a new anti-spam filtering strategy that (1) uses a practical entropy coding technique, called Huffman coding, to adaptively encode the feature space of an e-mail collection and, (2) applies logistic regression to fit a binary classification model to the collected data. We compared our technique to Naïve Bayes and K-Nearest Neighbor, and demonstrated the effectiveness of our technique by presenting the experimental results on the e-mail data that is publicly available. We also investigated the effect of skewed spam distributions on the performance of various techniques. Our contributions include a novel method for anti-spam filtering that is both effective and practical, and an adaptive learning model that may be applied in the area of information retrieval."