Research on the Classification of Imbalanced Datasets Based on Mixed Sampling (基于混合采样的非平衡数据集分类研究)
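This paper studies the problem that traditional over-sampling algorithms may shrink the decision region and increase the number of noise points while adding samples, and proposes a mixed-sampling algorithm based on misclassified samples. The algorithm uses SVM as the base classifier and iterates with AdaBoost. For the samples misclassified in each iteration, it applies an improved mixed-sampling strategy according to their spatial nearest-neighbor relationships: noise samples are deleted directly; for dangerous samples, the positive-class samples among their nearest neighbors are removed; and for safe samples, new samples are synthesized with the SMOTE algorithm and added to the new training set for retraining. Experiments on real datasets, with comparisons against the SMOTE-SVM and AdaBoost-SVM-OBMS algorithms, show that the algorithm effectively improves the classification accuracy of the negative class....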
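
The resampling step described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. The abstract does not give the rules that separate noise, dangerous, and safe samples, so the sketch assumes a neighbor-counting rule in the style of Borderline-SMOTE: all `k` neighbors in the majority class means noise, at least half means dangerous, otherwise safe. It also assumes that the negative class (`-1`) is the minority class and that only misclassified minority samples are resampled. The SVM and AdaBoost loop is omitted, so the indices of misclassified samples are passed in directly. The names `k_nearest`, `categorize`, and `mixed_sample` are hypothetical.

```python
import math
import random


def k_nearest(idx, X, k):
    """Indices of the k nearest neighbours of X[idx] (Euclidean), excluding idx."""
    others = [j for j in range(len(X)) if j != idx]
    others.sort(key=lambda j: math.dist(X[idx], X[j]))
    return others[:k]


def categorize(idx, X, y, k, majority=+1):
    """Label a sample 'noise', 'danger' or 'safe' (assumed thresholds, see lead-in)."""
    m = sum(1 for j in k_nearest(idx, X, k) if y[j] == majority)
    if m == k:
        return "noise"    # every neighbour belongs to the majority class
    if 2 * m >= k:
        return "danger"   # at least half of the neighbours are majority
    return "safe"         # most neighbours share the minority label


def mixed_sample(X, y, misclassified, k=3, minority=-1, majority=+1, rng=None):
    """Return a resampled (X, y) built from the misclassified minority samples."""
    rng = rng or random.Random(0)
    drop = set()      # indices of original samples to delete
    synthetic = []    # new minority points created by SMOTE-style interpolation
    for i in misclassified:
        if y[i] != minority:
            continue  # assumption: only misclassified minority samples are handled
        kind = categorize(i, X, y, k, majority)
        neighbours = k_nearest(i, X, k)  # neighbours are taken from the original set
        if kind == "noise":
            drop.add(i)
        elif kind == "danger":
            drop.update(j for j in neighbours if y[j] == majority)
        else:
            # SMOTE step: interpolate towards a random same-class neighbour
            same = [j for j in neighbours if y[j] == minority]
            j = rng.choice(same)
            gap = rng.random()
            synthetic.append([a + gap * (b - a) for a, b in zip(X[i], X[j])])
    keep = [t for t in range(len(X)) if t not in drop]
    X_new = [list(X[t]) for t in keep] + synthetic
    y_new = [y[t] for t in keep] + [minority] * len(synthetic)
    return X_new, y_new
```

In a full pipeline, `mixed_sample` would be called once per boosting round on the samples that the current SVM misclassifies, and the returned set would be used to train the next round. How the sample weights are carried over between rounds is not stated in the abstract and is left out here.
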
| Published in | 计算机应用研究 (Application Research of Computers) Vol. 32; no. 2; pp. 379 - 381 |
|---|---|
| Format | Journal Article |
| Language | Chinese |
| Published | College of Computer Science, Chongqing University, Chongqing 400030, 2015 |
| ISSN | 1001-3695 |
| DOI | 10.3969/j.issn.1001-3695.2015.02.014 |
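| Summary: | This paper studies the problem that traditional over-sampling algorithms may shrink the decision region and increase the number of noise points while adding samples, and proposes a mixed-sampling algorithm based on misclassified samples. The algorithm uses SVM as the base classifier and iterates with AdaBoost. For the samples misclassified in each iteration, it applies an improved mixed-sampling strategy according to their spatial nearest-neighbor relationships: noise samples are deleted directly; for dangerous samples, the positive-class samples among their nearest neighbors are removed; and for safe samples, new samples are synthesized with the SMOTE algorithm and added to the new training set for retraining. Experiments on real datasets, with comparisons against the SMOTE-SVM and AdaBoost-SVM-OBMS algorithms, show that the algorithm effectively improves the classification accuracy of the negative class. |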
|---|---|
| Bibliography: | 51-1196/TP. GU Ping, OU YANG Yuan-you (College of Computer Science, Chongqing University, Chongqing 400030, China). Keywords: mixed-sampling; misclassified samples; unbalanced data; AdaBoost algorithm; SVM algorithm. To address the problem that traditional over-sampling algorithms may shrink the decision region and increase the number of noise points while adding samples, this paper presents a mixed-sampling algorithm based on misclassified samples. The approach uses a support vector machine as the base classifier and identifies the misclassified samples in each iteration. According to the spatial relationship between each misclassified sample and its neighbors, it applies an improved mixed-sampling strategy: noise samples are removed directly; for dangerous samples, the positive-class samples among their neighbors are removed; and for safe samples, new samples are synthesized with the SMOTE algorithm and added to the original training set to retrain the classification model. Compared with SMOTE-SVM algori |