A novel multi-source weighted naive Bayes classifier

Bibliographic Details
Published in: Information Sciences, Vol. 721, p. 122568
Main Authors: Ou, Gui-Liang; He, Yu-Lin; Fournier-Viger, Philippe; Huang, Joshua Zhexue
Format: Journal Article
Language: English
Published: Elsevier Inc., 01.12.2025
ISSN: 0020-0255
DOI: 10.1016/j.ins.2025.122568

Summary: Several methods have been developed to alleviate the impact of the conditional independence assumption on the performance of naive Bayes classifiers (NBCs). Among these, the attribute-weighted NBC has gained considerable attention in recent years. Existing weighted NBC algorithms primarily determine attribute weights using metrics such as mutual information, entropy, and accuracy ratios; however, they typically ignore the underlying probability and spatial distribution information embedded within attributes and classes. This study proposes a novel multi-source weighted NBC (MS-WNBC) to address this gap and establish a more robust framework for attribute-weight characterization. The proposed method computes attribute weights by integrating three sources of information: attribute correlation, probability distribution, and structural characteristics. This multi-source fusion strategy enhances the discriminative quality of the attribute weights, thereby systematically reducing classification risk. Experiments evaluating MS-WNBC revealed the following: (1) MS-WNBC improves the classification robustness of NBC by fusing multi-source weights; (2) it significantly reduces classification risk while exhibiting strong resistance to interference; and (3) it yields statistically significant improvements in training and testing accuracy, average probability estimation quality, and area under the curve when compared with NBC and seven of its variants across 30 benchmark datasets. These findings indicate that MS-WNBC, characterized by high structural stability, robust correlation expression, and excellent generalization performance, is an efficient variant of NBC.
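The attribute-weighting idea the abstract refers to, scaling each attribute's class-conditional log-likelihood by a per-attribute weight w_i (equivalently, raising P(x_i | c) to the power w_i), can be sketched in plain Python. The class name, the toy dataset, and the weight values below are illustrative assumptions; the paper's MS-WNBC derives its weights by fusing correlation, probability-distribution, and structural information, which is not reproduced here.

```python
from collections import Counter, defaultdict
import math

class WeightedNB:
    """Generic attribute-weighted naive Bayes for categorical data.

    Scores a class c as log P(c) + sum_i w_i * log P(x_i | c),
    with Laplace smoothing on the conditional probabilities.
    Weights are supplied externally (a stand-in for any weighting
    scheme, not the MS-WNBC fusion itself).
    """

    def __init__(self, weights, alpha=1.0):
        self.weights = weights   # one weight per attribute
        self.alpha = alpha       # Laplace smoothing constant

    def fit(self, X, y):
        self.classes = sorted(set(y))
        self.class_count = Counter(y)
        self.n = len(y)
        n_attrs = len(X[0])
        # cond[i][c] counts values of attribute i within class c
        self.cond = [defaultdict(Counter) for _ in range(n_attrs)]
        self.values = [set() for _ in range(n_attrs)]
        for row, c in zip(X, y):
            for i, v in enumerate(row):
                self.cond[i][c][v] += 1
                self.values[i].add(v)
        return self

    def predict(self, row):
        best, best_lp = None, -math.inf
        for c in self.classes:
            lp = math.log(self.class_count[c] / self.n)  # log prior
            for i, v in enumerate(row):
                num = self.cond[i][c][v] + self.alpha
                den = self.class_count[c] + self.alpha * len(self.values[i])
                lp += self.weights[i] * math.log(num / den)
            if lp > best_lp:
                best, best_lp = c, lp
        return best

# Illustrative data: attribute 0 is weighted twice as heavily as attribute 1.
X = [("sunny", "hot"), ("sunny", "mild"), ("rain", "mild"), ("rain", "cool")]
y = ["no", "no", "yes", "yes"]
clf = WeightedNB(weights=[1.0, 0.5]).fit(X, y)
```

Setting every weight to 1.0 recovers standard naive Bayes, while a weight of 0 drops an attribute entirely; weighting is thus a soft relaxation of the conditional independence assumption the abstract discusses.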