Learning from Imbalanced Data in Presence of Noisy and Borderline Examples
| Published in | Rough Sets and Current Trends in Computing, pp. 158–167 |
|---|---|
| Main Authors | , , |
| Format | Book Chapter |
| Language | English |
| Published | Berlin, Heidelberg: Springer Berlin Heidelberg, 2010 |
| Series | Lecture Notes in Computer Science |
| ISBN | 9783642135286, 3642135285 |
| ISSN | 0302-9743, 1611-3349 |
| DOI | 10.1007/978-3-642-13529-3_18 |
| Summary | In this paper we studied re-sampling methods for learning classifiers from imbalanced data. We carried out a series of experiments on artificial data sets to explore the impact of noisy and borderline examples from the minority class on classifier performance. Results showed that when the data were sufficiently disturbed by these factors, the focused re-sampling methods (NCR and our SPIDER2) strongly outperformed the oversampling methods. They also performed better on real-life data, where PCA visualizations suggested the possible existence of noisy examples and large overlapping areas between classes. |