Spam detection using data compression and signatures

Publisher

Taylor & Francis

Abstract

In this article, we introduce a novel method for spam detection that combines Bayesian filtering, signature trees, and data compression-based similarity. Bayesian filtering is one of the most popular and most efficient algorithms for spam detection. Its weakness is that it cannot classify every e-mail with certainty, and some spam e-mails are classified as regular e-mails. The novel method addresses this problem by using signature trees and data compression-based similarity. The main result of this article is an improvement in spam detection precision of up to 99% with this method.
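The abstract does not say which compression-based similarity measure the article uses. As a rough illustration of the general idea only, the sketch below assumes the normalized compression distance (NCD) with Python's zlib as the compressor. The function names and the sample messages are hypothetical and are not taken from the article.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed data (level 9)."""
    return len(zlib.compress(data, 9))


def ncd(x: str, y: str) -> float:
    """Normalized compression distance between two texts.

    Assumption: NCD stands in for the article's similarity measure,
    which the abstract does not specify. Lower values mean the two
    texts share more compressible structure.
    """
    bx, by = x.encode("utf-8"), y.encode("utf-8")
    cx, cy = compressed_size(bx), compressed_size(by)
    cxy = compressed_size(bx + by)
    return (cxy - min(cx, cy)) / max(cx, cy)


# Hypothetical sample messages, for illustration only.
known_spam = ("Congratulations! You have won a free prize. "
              "Click the link below to claim your reward now.")
candidate = ("Congratulations! You have won a free holiday. "
             "Click the link below to claim your reward today.")
regular = ("Hi team, the quarterly review meeting is moved to Thursday "
           "at 10am in room 204. Please bring your reports.")

print(ncd(known_spam, candidate))  # near-duplicate of known spam
print(ncd(known_spam, regular))    # unrelated regular e-mail
```

In a setup like this, a message the Bayesian filter leaves in doubt could be compared against known spam, and a small distance would count as evidence that it is spam. How the article actually combines the similarity with signature trees is not described in the abstract.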

Subject(s)

Bayesian filter, data compression, e-mail, S-tree, signatures, spam

Citation

Cybernetics and Systems. 2013, vol. 44, issue 6-7, p. 533-549.