Robust Multi-Kernel Nearest Neighborhood for Outlier Detection

Bibliographic Details
Published in IEEE transactions on knowledge and data engineering Vol. 36; no. 8; pp. 4220-4231
Main Authors Wang, Xinye; Duan, Lei; Yu, Zhenyang; He, Chengxin; Bao, Zhifeng
Format Journal Article
Language English
Published New York IEEE 01.08.2024
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN 1041-4347
1558-2191
DOI 10.1109/TKDE.2024.3364179

Abstract Outlier detection methods based on distance measures have been used in numerous applications due to their effectiveness and interpretability. However, distances among instances depend heavily on the feature space in which they reside. For an outlier, its distances to the normal instances may be extremely small in one feature space, so that it cannot be separated from them, while the situation is reversed in another space. Meanwhile, the distance measure is sensitive to a few "marginal instances" (i.e., normal instances located very close to outliers in the feature space) when estimating whether a test instance is an outlier. In this article, we propose a robust multi-kernel nearest neighborhood (RMKN) method for outlier detection. Specifically, in the training phase, we consider only normal instances and transform them into a digraph weighted by a polynomial kernel function to capture their geometric relationships in the original feature space. Then, we develop an objective function based on the weighted digraph to find a latent feature space via multi-kernel learning such that normal instances in this latent feature space are as close to one another as possible while preserving their original distributions. In the detecting phase, we design an outlying score based on the two-stage multi-kernel k-nearest neighbors to detect outliers. Extensive experiments with ten datasets show that RMKN is effective and robust.
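The detecting-phase idea in the abstract (scoring a test instance by its distances to nearest normal instances in a multi-kernel space) can be illustrated with a small sketch. This is not the RMKN method: the abstract does not give the objective function, the digraph construction, or the two-stage score, so the sketch below assumes fixed, hand-picked kernel weights instead of learned ones, adds an RBF kernel as a second base kernel, and uses the mean distance to the k nearest training instances as the score. All function names and parameter values are hypothetical.

```python
import numpy as np

def polynomial_kernel(X, Y, degree=2, coef0=1.0):
    """Polynomial kernel matrix: K[i, j] = (x_i . y_j + coef0) ** degree."""
    return (X @ Y.T + coef0) ** degree

def rbf_kernel(X, Y, gamma=0.5):
    """RBF kernel matrix: K[i, j] = exp(-gamma * ||x_i - y_j||^2)."""
    sq = (X ** 2).sum(axis=1)[:, None] - 2.0 * X @ Y.T + (Y ** 2).sum(axis=1)[None, :]
    return np.exp(-gamma * np.maximum(sq, 0.0))

def combined_kernel(X, Y, weights):
    """Weighted sum of base kernels. The weights are fixed here (assumption);
    a multi-kernel learning method would learn them from the normal instances."""
    base_kernels = (polynomial_kernel, rbf_kernel)
    return sum(w * kern(X, Y) for w, kern in zip(weights, base_kernels))

def kernel_knn_scores(train, test, weights=(0.5, 0.5), k=3):
    """Score each test row by its mean kernel-induced distance to its k nearest
    training rows. Larger scores suggest the row is more outlying."""
    k_cross = combined_kernel(test, train, weights)            # shape (n_test, n_train)
    k_test = np.diag(combined_kernel(test, test, weights))     # k(x, x), computed simply for clarity
    k_train = np.diag(combined_kernel(train, train, weights))  # k(y, y)
    # Squared distance in the kernel-induced space: k(x,x) - 2 k(x,y) + k(y,y).
    d2 = k_test[:, None] - 2.0 * k_cross + k_train[None, :]
    dist = np.sqrt(np.maximum(d2, 0.0))                        # clip tiny negative rounding errors
    nearest = np.sort(dist, axis=1)[:, :k]
    return nearest.mean(axis=1)

if __name__ == "__main__":
    # Toy data: normal instances clustered near the origin.
    train = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                      [0.1, 0.1], [-0.1, 0.0], [0.0, -0.1]])
    test = np.array([[0.05, 0.05], [3.0, 3.0]])
    print(kernel_knn_scores(train, test, k=3))
```

In this toy run the second test point, which lies far from the training cluster, receives a much larger score than the first. Only normal instances are used for training, which matches the training setup described in the abstract; everything else is a simplification.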
Author_xml – sequence: 1
  givenname: Xinye
  orcidid: 0000-0003-2095-6117
  surname: Wang
  fullname: Wang, Xinye
  email: wangxinye@stu.scu.edu.cn
  organization: School of Computer Science, Sichuan University, Chengdu, China
– sequence: 2
  givenname: Lei
  orcidid: 0000-0001-7254-1832
  surname: Duan
  fullname: Duan, Lei
  email: leiduan@scu.edu.cn
  organization: School of Computer Science, Sichuan University, Chengdu, China
– sequence: 3
  givenname: Zhenyang
  orcidid: 0000-0001-9198-9851
  surname: Yu
  fullname: Yu, Zhenyang
  email: yuzhenyang@stu.scu.edu.cn
  organization: School of Computer Science, Sichuan University, Chengdu, China
– sequence: 4
  givenname: Chengxin
  orcidid: 0000-0003-3759-3914
  surname: He
  fullname: He, Chengxin
  email: hechengxin@stu.scu.edu.cn
  organization: School of Computer Science, Sichuan University, Chengdu, China
– sequence: 5
  givenname: Zhifeng
  orcidid: 0000-0003-2477-381X
  surname: Bao
  fullname: Bao, Zhifeng
  email: zhifeng.bao@rmit.edu.au
  organization: RMIT University, Melbourne, VIC, Australia
CODEN ITKEEH
CitedBy_id crossref_primary_10_3390_math12132048
crossref_primary_10_1007_s10115_024_02324_y
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2024
DatabaseName IEEE All-Society Periodicals Package (ASPP) 2005-present
IEEE All-Society Periodicals Package (ASPP) 1998-Present
IEEE Electronic Library (IEL)
CrossRef
Computer and Information Systems Abstracts
Electronics & Communications Abstracts
Technology Research Database
ProQuest Computer Science Collection
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts – Academic
Computer and Information Systems Abstracts Professional
Discipline Engineering
Computer Science
EISSN 1558-2191
EndPage 4231
ExternalDocumentID 10_1109_TKDE_2024_3364179
10433793
Genre orig-research
GrantInformation_xml – fundername: National Natural Science Foundation of China
  grantid: 61972268
  funderid: 10.13039/501100001809
– fundername: ARC
  grantid: DP220101434; DP240101211
ISSN 1041-4347
IsPeerReviewed true
IsScholarly true
Issue 8
Language English
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
https://doi.org/10.15223/policy-029
https://doi.org/10.15223/policy-037
PageCount 12
PublicationCentury 2000
PublicationDate 2024-08-01
PublicationDecade 2020
PublicationPlace New York
PublicationTitle IEEE transactions on knowledge and data engineering
PublicationTitleAbbrev TKDE
PublicationYear 2024
Publisher IEEE
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
StartPage 4220
SubjectTerms Anomaly detection
Data analysis
Effectiveness
Feature extraction
Games
Graph theory
Kernel
Kernel functions
multi-kernel learning
nearest neighborhood
Outlier detection
Outliers (statistics)
Polynomials
Robustness
Support vector machines
Task analysis
Training
weighted digraph
Title Robust Multi-Kernel Nearest Neighborhood for Outlier Detection
URI https://ieeexplore.ieee.org/document/10433793
https://www.proquest.com/docview/3078078437
Volume 36