Robust Multi-Kernel Nearest Neighborhood for Outlier Detection
Published in | IEEE Transactions on Knowledge and Data Engineering, Vol. 36, no. 8, pp. 4220-4231 |
---|---|
Main Authors | Wang, Xinye; Duan, Lei; Yu, Zhenyang; He, Chengxin; Bao, Zhifeng |
Format | Journal Article |
Language | English |
Published | New York: IEEE, 01.08.2024; The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
Subjects | |
ISSN | 1041-4347 1558-2191 |
DOI | 10.1109/TKDE.2024.3364179 |
Abstract | Outlier detection methods based on distance measure have been used in numerous applications due to their effectiveness and interpretability. However, distances among instances heavily depend on the feature space in which they reside. For an outlier, distances from it to the normal instances may be extremely close in one feature space, failing to separate them from each other, while this situation is reversed in another space. Meanwhile, the distance measure is sensitive to a few "marginal instances" (i.e., normal instances located very close to outliers in the feature space) during the estimation of whether a test instance is an outlier or not. In this article, we propose a robust multi-kernel nearest neighborhood (RMKN) method for outlier detection. Specifically, in the training phase, we only consider normal instances and transform them into a Polynomial kernel function weighted digraph to capture their geometric relationships in the original feature space. Then, we develop an objective function based on the weighted digraph to find a latent feature space via multi-kernel learning such that distances among normal instances in this latent feature space are as close as possible while preserving their original distributions. In the detecting phase, we design an outlying score based on the two-stage multi-kernel k-nearest neighbors to detect outliers. Extensive experiments with ten datasets show that RMKN is effective and robust. |
---|---|
Author | He, Chengxin; Duan, Lei; Bao, Zhifeng; Wang, Xinye; Yu, Zhenyang |
Author_xml | 1. Wang, Xinye (ORCID: 0000-0003-2095-6117; email: wangxinye@stu.scu.edu.cn; School of Computer Science, Sichuan University, Chengdu, China); 2. Duan, Lei (ORCID: 0000-0001-7254-1832; email: leiduan@scu.edu.cn; School of Computer Science, Sichuan University, Chengdu, China); 3. Yu, Zhenyang (ORCID: 0000-0001-9198-9851; email: yuzhenyang@stu.scu.edu.cn; School of Computer Science, Sichuan University, Chengdu, China); 4. He, Chengxin (ORCID: 0000-0003-3759-3914; email: hechengxin@stu.scu.edu.cn; School of Computer Science, Sichuan University, Chengdu, China); 5. Bao, Zhifeng (ORCID: 0000-0003-2477-381X; email: zhifeng.bao@rmit.edu.au; RMIT University, Melbourne, VIC, Australia) |
CODEN | ITKEEH |
CitedBy_id | crossref_primary_10_3390_math12132048 crossref_primary_10_1007_s10115_024_02324_y |
ContentType | Journal Article |
Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2024 |
Copyright_xml | – notice: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2024 |
DBID | 97E RIA RIE AAYXX CITATION 7SC 7SP 8FD JQ2 L7M L~C L~D |
DOI | 10.1109/TKDE.2024.3364179 |
DatabaseName | IEEE All-Society Periodicals Package (ASPP) 2005-present; IEEE All-Society Periodicals Package (ASPP) 1998-Present; IEEE Electronic Library (IEL); CrossRef; Computer and Information Systems Abstracts; Electronics & Communications Abstracts; Technology Research Database; ProQuest Computer Science Collection; Advanced Technologies Database with Aerospace; Computer and Information Systems Abstracts – Academic; Computer and Information Systems Abstracts Professional |
DatabaseTitle | CrossRef; Technology Research Database; Computer and Information Systems Abstracts – Academic; Electronics & Communications Abstracts; ProQuest Computer Science Collection; Computer and Information Systems Abstracts; Advanced Technologies Database with Aerospace; Computer and Information Systems Abstracts Professional |
DatabaseTitleList | Technology Research Database |
Database_xml | – sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/ sourceTypes: Publisher |
DeliveryMethod | fulltext_linktorsrc |
Discipline | Engineering Computer Science |
EISSN | 1558-2191 |
EndPage | 4231 |
ExternalDocumentID | 10_1109_TKDE_2024_3364179 10433793 |
Genre | orig-research |
GrantInformation_xml | National Natural Science Foundation of China (grant ID: 61972268; funder ID: 10.13039/501100001809); ARC (grant IDs: DP220101434; DP240101211) |
ISSN | 1041-4347 |
IngestDate | Mon Jun 30 03:55:47 EDT 2025 Wed Oct 01 02:06:31 EDT 2025 Thu Apr 24 23:04:06 EDT 2025 Wed Aug 27 02:05:20 EDT 2025 |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 8 |
Language | English |
License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html https://doi.org/10.15223/policy-029 https://doi.org/10.15223/policy-037 |
LinkModel | DirectLink |
ORCID | 0000-0001-9198-9851 0000-0003-3759-3914 0000-0003-2095-6117 0000-0003-2477-381X 0000-0001-7254-1832 |
PQID | 3078078437 |
PQPubID | 85438 |
PageCount | 12 |
ParticipantIDs | crossref_primary_10_1109_TKDE_2024_3364179 ieee_primary_10433793 proquest_journals_3078078437 crossref_citationtrail_10_1109_TKDE_2024_3364179 |
ProviderPackageCode | CITATION AAYXX |
PublicationCentury | 2000 |
PublicationDate | 2024-08-01 |
PublicationDateYYYYMMDD | 2024-08-01 |
PublicationDate_xml | – month: 08 year: 2024 text: 2024-08-01 day: 01 |
PublicationDecade | 2020 |
PublicationPlace | New York |
PublicationPlace_xml | – name: New York |
PublicationTitle | IEEE transactions on knowledge and data engineering |
PublicationTitleAbbrev | TKDE |
PublicationYear | 2024 |
Publisher | IEEE; The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
Publisher_xml | name: IEEE; name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
SSID | ssj0008781 |
Score | 2.4641695 |
SourceID | proquest crossref ieee |
SourceType | Aggregation Database; Enrichment Source; Index Database; Publisher |
StartPage | 4220 |
SubjectTerms | Anomaly detection; Data analysis; Effectiveness; Feature extraction; Games; Graph theory; Kernel; Kernel functions; multi-kernel learning; nearest neighborhood; Outlier detection; Outliers (statistics); Polynomials; Robustness; Support vector machines; Task analysis; Training; weighted digraph |
Title | Robust Multi-Kernel Nearest Neighborhood for Outlier Detection |
URI | https://ieeexplore.ieee.org/document/10433793 https://www.proquest.com/docview/3078078437 |
Volume | 36 |