Research Details

Title: A Weak-Region Enhanced Bayesian Classification for Spam Content-Based Filtering
Research type: Published article
Keywords: Bayesian, feature selection, spam detection, text classification
Abstract: This article proposes an improved Bayesian scheme that focuses on the region in which a Bayesian classifier may fail to identify labels correctly, and improves classification performance by handling those errors. The Bayesian method, as a probabilistic classifier, uses Bayes' theorem to calculate the probability that an instance belongs to each class and assigns the instance the class label with the maximum probability. In a spam detection problem, the prediction of the Bayesian classifier can be considered weak when the probabilities obtained for the spam and non-spam classes are close to each other. We therefore define a threshold that separates weak predictions from strong ones. A hybrid strategy using a two-layer Bayesian approach is presented: basic Bayesian (BBayes) and corrected weak region Bayesian (CWRBayes), which handle strong and weak predictions, respectively. BBayes and CWRBayes share the same classification mechanism but use different feature selection mechanisms. The proposed methods are implemented and evaluated on two datasets of spam e-mails, and the results show that the proposed method performs better than the naïve Bayesian baseline and some other Bayesian variants.
Researchers: Mohsen Rahmani (second author), Vahid Nosrati (first author)
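
The two-layer idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not state how the threshold is applied or which feature selection methods are used, so the sketch makes several assumptions: a prediction counts as weak when the gap between the two largest posteriors is below a hypothetical `threshold` parameter, each layer is a multinomial naive Bayes with Laplace smoothing, and the two layers differ only in a hand-picked vocabulary that stands in for the two feature selection mechanisms. The class name `NaiveBayes`, the function `classify`, and the toy data are all illustrative.

```python
import math
from collections import Counter


class NaiveBayes:
    """Multinomial naive Bayes over a fixed vocabulary, with Laplace smoothing."""

    def __init__(self, vocabulary):
        # The vocabulary stands in for the output of a feature selection step.
        self.vocabulary = set(vocabulary)

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.total_docs = len(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for tokens, label in zip(docs, labels):
            # Tokens outside the selected vocabulary are ignored.
            self.word_counts[label].update(
                t for t in tokens if t in self.vocabulary
            )
        return self

    def posteriors(self, tokens):
        """Return a dict mapping each class to its normalized posterior."""
        log_scores = {}
        for c in self.classes:
            total = sum(self.word_counts[c].values())
            score = math.log(self.doc_counts[c] / self.total_docs)
            for t in tokens:
                if t in self.vocabulary:
                    numerator = self.word_counts[c][t] + 1
                    denominator = total + len(self.vocabulary)
                    score += math.log(numerator / denominator)
            log_scores[c] = score
        # Normalize in log space to avoid underflow on long messages.
        top = max(log_scores.values())
        unnormalized = {c: math.exp(s - top) for c, s in log_scores.items()}
        z = sum(unnormalized.values())
        return {c: v / z for c, v in unnormalized.items()}


def classify(tokens, bbayes, cwrbayes, threshold=0.2):
    """Return (label, is_weak) using the two-layer scheme.

    Assumption: the first-layer prediction is weak when the gap between the
    two largest posteriors is below `threshold`; the second layer then decides.
    """
    post = bbayes.posteriors(tokens)
    ranked = sorted(post.values(), reverse=True)
    is_weak = (ranked[0] - ranked[1]) < threshold
    if is_weak:
        post = cwrbayes.posteriors(tokens)
    return max(post, key=post.get), is_weak


# Toy data, invented purely for illustration.
docs = [
    ["free", "money", "now"],
    ["win", "free", "prize"],
    ["meeting", "at", "noon"],
    ["project", "meeting", "notes"],
]
labels = ["spam", "spam", "ham", "ham"]

# Hypothetical feature sets: a small one for the first layer, a wider one
# for the weak-region layer.
bbayes = NaiveBayes({"free", "meeting"}).fit(docs, labels)
cwrbayes = NaiveBayes(
    {"free", "meeting", "money", "win", "prize", "project", "notes", "noon"}
).fit(docs, labels)

print(classify(["free", "free", "money"], bbayes, cwrbayes))
print(classify(["money", "prize"], bbayes, cwrbayes))
```

In this toy setup the first message is decided by the first layer alone, while the second contains no first-layer features, falls into the weak region, and is passed to the second layer. The threshold value and both vocabularies would need to be tuned on real data.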