Noisy data elimination using mutual k-nearest neighbor for classification mining
| Published in | The Journal of Systems and Software Vol. 85; no. 5; pp. 1067 - 1074 |
|---|---|
| Main Authors | Liu, Huawen; Zhang, Shichao |
| Format | Journal Article |
| Language | English |
| Published | New York: Elsevier Inc; Elsevier Sequoia S.A, 01.05.2012 |
| Subjects | Algorithms; Anomalies; Artificial intelligence; Classification; Computer programs; Data mining; Data reduction; kNN; Learning; Mutual nearest neighbor; Noise; Pattern classification; Pattern recognition; Software; Studies; Systems design |
| ISSN | 0164-1212 (print); 1873-1228 (electronic) |
| DOI | 10.1016/j.jss.2011.12.019 |
| Abstract | ► A new lazy learning algorithm, named MkNNC, is designed for pattern classification. MkNNC is an instance-based learning method; its core idea is intuitive and easy to implement, and it is more robust when it encounters noisy or inconsistent data. ► Anomalies are first detected and removed from the database by the mutual nearest neighbors before the classification model is constructed. Consequently, noisy instances are not used as decisive evidence during the learning process, and the final predictions are more credible. ► MkNNC involves both classification learning and anomaly detection and elimination. Both tasks are performed with mutual nearest neighbors (MNN), which carry more useful and reliable information than kNN in determining the relationship between instances.
k nearest neighbor (kNN) is an effective and powerful lazy learning algorithm, and it is easy to implement. However, its performance relies heavily on the quality of the training data. In many complex real-world applications, noise from various sources is prevalent in large-scale databases, and eliminating anomalies to improve data quality remains a challenge. To alleviate this problem, this paper proposes a new anomaly-removal and learning algorithm within the kNN framework. The primary characteristic of the method is that mutual nearest neighbors, rather than k nearest neighbors, provide the evidence both for removing anomalies and for predicting the class labels of unseen instances. The advantage is that pseudo nearest neighbors can be identified and excluded from the prediction process, so the final learning result is more credible. An extensive comparative experimental analysis on UCI datasets provides empirical evidence of the effectiveness of the proposed method in enhancing the performance of the kNN rule. |
|---|---|
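The sketch below illustrates the mutual-nearest-neighbor idea the abstract describes: instances are filtered with mutual neighbors before a query is classified by its mutual neighbors. It is a minimal illustration under stated assumptions, not the paper's exact MkNNC procedure; in particular the noise criterion (dropping an instance whose mutual neighbors never share its class), the plain-kNN fallback, and all function names are illustrative choices of this sketch.

```python
# Illustrative sketch of classification with mutual k-nearest neighbors (MNN).
# Assumptions (not from the paper): Euclidean distance, the noise rule below,
# and a plain-kNN fallback when a query has no mutual neighbors.
import numpy as np

def knn_indices(X, k):
    """For each row of X, return the indices of its k nearest neighbors (Euclidean)."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)  # pairwise distances
    np.fill_diagonal(d, np.inf)                                  # exclude self
    return np.argsort(d, axis=1)[:, :k]

def mutual_neighbors(X, k):
    """i and j are mutual k-nearest neighbors iff each appears in the other's kNN list."""
    nbrs = knn_indices(X, k)
    in_knn = np.zeros((len(X), len(X)), dtype=bool)
    for i, row in enumerate(nbrs):
        in_knn[i, row] = True
    return in_knn & in_knn.T                                     # symmetric MNN relation

def remove_noise(X, y, k):
    """Drop instances whose mutual neighbors never share their class (assumed criterion)."""
    mnn = mutual_neighbors(X, k)
    keep = []
    for i in range(len(X)):
        mates = np.where(mnn[i])[0]
        if len(mates) > 0 and np.any(y[mates] == y[i]):
            keep.append(i)
    keep = np.array(keep, dtype=int)
    return X[keep], y[keep]

def predict(x, X_train, y_train, k):
    """Classify x by majority vote among training points that are mutual neighbors of x."""
    Xa = np.vstack([X_train, x[None, :]])
    mnn = mutual_neighbors(Xa, k)
    mates = np.where(mnn[-1][:-1])[0]                            # mutual neighbors of the query
    if len(mates) == 0:                                          # fall back to a plain kNN vote
        mates = knn_indices(Xa, k)[-1]
        mates = mates[mates < len(X_train)]
    return np.bincount(y_train[mates]).argmax()

# Tiny usage example on synthetic 2-D data with one injected label-noise point.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0, 0.3, (20, 2)), rng.normal(3, 0.3, (20, 2))])
    y = np.array([0] * 20 + [1] * 20)
    y[5] = 1                                                     # inject a noisy label
    Xc, yc = remove_noise(X, y, k=3)
    print("kept", len(Xc), "of", len(X), "instances")
    print("prediction for [0, 0]:", predict(np.array([0.0, 0.0]), Xc, yc, k=3))
```

The dense pairwise-distance matrix keeps the sketch short and is adequate for UCI-scale datasets such as those used in the paper's experiments; for larger data a spatial index or batched distance computation would be the natural substitute.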
| Author | Liu, Huawen; Zhang, Shichao |
| Author Affiliations | Liu, Huawen (hwliu@zjnu.edu.cn), Department of Computer Science, Zhejiang Normal University, China; Zhang, Shichao (zhangsc@it.uts.edu.au), College of Computer Science and Information Technology, Guangxi Normal University, China |
| CODEN | JSSODM |
| ContentType | Journal Article |
| Copyright | 2011 Elsevier Inc. Copyright Elsevier Sequoia S.A. May 2012 |
| DOI | 10.1016/j.jss.2011.12.019 |
| DatabaseName | CrossRef Computer and Information Systems Abstracts Technology Research Database ProQuest Computer Science Collection Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Academic Computer and Information Systems Abstracts Professional |
| Discipline | Computer Science |
| EISSN | 1873-1228 |
| EndPage | 1074 |
| Genre | Feature |
| ISSN | 0164-1212 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 5 |
| Keywords | kNN; Data mining; Mutual nearest neighbor; Pattern classification; Data reduction |
| Language | English |
| PageCount | 8 |
| PublicationDate | 2012-05-01 |
| PublicationPlace | New York |
| PublicationTitle | The Journal of Systems and Software |
| PublicationYear | 2012 |
| Publisher | Elsevier Inc Elsevier Sequoia S.A |
| StartPage | 1067 |
| SubjectTerms | Algorithms; Anomalies; Artificial intelligence; Classification; Computer programs; Data mining; Data reduction; kNN; Learning; Mutual nearest neighbor; Noise; Pattern classification; Pattern recognition; Software; Studies; Systems design |
| Title | Noisy data elimination using mutual k-nearest neighbor for classification mining |
| URI | https://dx.doi.org/10.1016/j.jss.2011.12.019 https://www.proquest.com/docview/925769228 https://www.proquest.com/docview/1022881215 |
| Volume | 85 |