2024-12-27
Mohsen Rahmani

Academic rank: Associate Professor
ORCID: https://orcid.org/0000-0001-6890-192X
Education: PhD.
ScopusId: 37061814300
HIndex:
Faculty: Engineering
Address: Arak University
Phone:

Research

Title
A Weak-Region Enhanced Bayesian Classification for Spam Content-Based Filtering
Type
JournalPaper
Keywords
Bayesian, feature selection, spam detection, text classification
Year
2023
Journal
ACM Transactions on Asian and Low-Resource Language Information Processing
DOI
Researchers
Vahid Nosrati, Mohsen Rahmani

Abstract

This article proposes an improved Bayesian scheme that focuses on the region in which a Bayesian classifier may fail to identify labels correctly, and improves classification performance by handling those errors. The Bayesian method, as a probabilistic classifier, uses Bayes’ theorem to calculate the probability that an instance belongs to each class, and assigns the class label with the maximum probability to the instance. In a spam detection problem, the prediction of the Bayesian classifier can be considered weak when the probabilities obtained for the spam and non-spam classes are close to each other. Therefore, we define a threshold to distinguish weak predictions from strong ones. A hybrid strategy using a two-layer Bayesian approach is presented: basic Bayesian (BBayes) and corrected weak region Bayesian (CWRBayes), which handle strong and weak predictions, respectively. The two techniques share the same classification mechanism but use different feature selection mechanisms. The proposed methods are implemented and evaluated on two datasets of spam e-mails, and the results show that the proposed method performs better than the naïve Bayesian baseline and some other Bayesian variants.
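
The two-layer idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how the threshold is defined or how each layer selects features. The sketch assumes that "weak" means the gap between the two class posteriors falls below a threshold, and it stands in for feature selection with two hand-picked vocabularies. The names `TinyNB`, `two_layer_predict`, `strong_model`, and `weak_model`, as well as the toy data, are hypothetical.

```python
import math
from collections import Counter


class TinyNB:
    """Multinomial naive Bayes with Laplace smoothing over a fixed vocabulary.

    The vocabulary plays the role of the selected feature set (an assumption
    made for illustration; the paper's feature selection is not shown here).
    """

    def __init__(self, vocabulary):
        self.vocab = set(vocabulary)

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.n_docs = Counter(labels)
        self.counts = {c: Counter() for c in self.classes}
        for tokens, label in zip(docs, labels):
            # Only tokens inside the selected vocabulary are counted.
            self.counts[label].update(t for t in tokens if t in self.vocab)
        return self

    def posteriors(self, tokens):
        total_docs = sum(self.n_docs.values())
        log_scores = {}
        for c in self.classes:
            class_total = sum(self.counts[c].values())
            score = math.log(self.n_docs[c] / total_docs)  # log prior
            for t in tokens:
                if t in self.vocab:
                    # Laplace-smoothed log likelihood of the token.
                    score += math.log(
                        (self.counts[c][t] + 1) / (class_total + len(self.vocab))
                    )
            log_scores[c] = score
        # Normalize in log space to obtain posterior probabilities.
        top = max(log_scores.values())
        norm = sum(math.exp(s - top) for s in log_scores.values())
        return {c: math.exp(s - top) / norm for c, s in log_scores.items()}


def two_layer_predict(tokens, strong_model, weak_model, threshold):
    """Use the first layer when its posteriors are far apart, else the second."""
    post = strong_model.posteriors(tokens)
    ranked = sorted(post.values(), reverse=True)
    if ranked[0] - ranked[1] >= threshold:  # assumed definition of "strong"
        return max(post, key=post.get), "strong"
    post = weak_model.posteriors(tokens)
    return max(post, key=post.get), "weak"


# Toy training data (invented for this sketch).
docs = [
    ["win", "cash", "now"],
    ["cheap", "cash", "offer"],
    ["meeting", "agenda", "today"],
    ["project", "meeting", "notes"],
]
labels = ["spam", "spam", "ham", "ham"]

all_tokens = {t for d in docs for t in d}
strong_model = TinyNB({"cash", "meeting"}).fit(docs, labels)  # small feature set
weak_model = TinyNB(all_tokens).fit(docs, labels)             # larger feature set

clear_case = two_layer_predict(["cash", "cash", "win"], strong_model, weak_model, 0.5)
close_case = two_layer_predict(["cash", "meeting", "agenda"], strong_model, weak_model, 0.5)
```

In this toy setup, the first message is decided by the first layer because its posteriors are well separated, while the second message produces a tie in the first layer and is passed to the second layer, which sees more features. The threshold value of 0.5 is arbitrary; in practice it would need to be tuned on validation data.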