Duplicated record detection based on improved RBF neural network
This paper presents a method based on a modified Radial Basis Function (RBF) neural network to improve the accuracy and recall rate of duplicated-record detection. First, the key fields of the records are clustered using Density-Based Spatial Clustering of Applications with Noise (DBSCAN) and all records ar...
| Published in | 2017 IEEE 2nd Advanced Information Technology, Electronic and Automation Control Conference (IAEAC) pp. 2034 - 2037 |
|---|---|
| Main Authors | Liu, Xinting; Cai, Xiaodong; Li, Bo; Chen, Mingyao |
| Format | Conference Proceeding |
| Language | English |
| Published | IEEE, 01.03.2017 |
| Subjects | |
| DOI | 10.1109/IAEAC.2017.8054373 |
| Abstract | This paper presents a method based on a modified Radial Basis Function (RBF) neural network to improve the accuracy and recall rate of duplicated-record detection. First, the key fields of the records are clustered using Density-Based Spatial Clustering of Applications with Noise (DBSCAN), which groups all records into several classes that contain duplicated records. Second, the similarity of corresponding fields of the records in each class is computed with the Jaro algorithm, and duplicated records are labeled manually. Finally, the Subtractive Clustering Method (SCM) and Particle Swarm Optimization (PSO) are used to optimize the parameters of the RBF neural network, so that a monitoring model for duplicated records is built. The method is tested on different datasets. The experimental results show that the accuracy and recall rate of duplicated-record detection are improved significantly. |
| Author | Li, Bo; Chen, Mingyao; Liu, Xinting; Cai, Xiaodong |
| Author_xml | 1. Liu, Xinting (School of Computer and Information Security, Guilin University of Electronic Technology, China); 2. Cai, Xiaodong, caixiaodong@guet.edu.cn (School of Computer and Information Security, Guilin University of Electronic Technology, China); 3. Li, Bo (School of Computer and Information Security, Guilin University of Electronic Technology, China); 4. Chen, Mingyao (Guilin Topintelligent Communication Technology Co., Ltd, China) |
| ContentType | Conference Proceeding |
| DBID | 6IE 6IL CBEJK RIE RIL |
| DOI | 10.1109/IAEAC.2017.8054373 |
| DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings; IEEE Xplore POP ALL; IEEE Xplore All Conference Proceedings; IEEE Electronic Library (IEL); IEEE Proceedings Order Plans (POP All) 1998-Present |
| DatabaseTitleList | |
| Database_xml | – sequence: 1 dbid: RIE name: IEEE Xplore digital library url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/ sourceTypes: Publisher |
| DeliveryMethod | fulltext_linktorsrc |
| EISBN | 146738979X 9781467389792 |
| EndPage | 2037 |
| ExternalDocumentID | 8054373 |
| Genre | orig-research |
| GroupedDBID | 6IE 6IL CBEJK RIE RIL |
| ID | FETCH-LOGICAL-i90t-9447094e1e405de3b450112e89eec94dcf80598a2f6083bfd8f9449971c26b983 |
| IEDL.DBID | RIE |
| IngestDate | Thu Jun 29 18:38:06 EDT 2023 |
| IsPeerReviewed | false |
| IsScholarly | false |
| Language | English |
| LinkModel | DirectLink |
| MergedId | FETCHMERGED-LOGICAL-i90t-9447094e1e405de3b450112e89eec94dcf80598a2f6083bfd8f9449971c26b983 |
| PageCount | 4 |
| ParticipantIDs | ieee_primary_8054373 |
| PublicationCentury | 2000 |
| PublicationDate | 2017-March |
| PublicationDateYYYYMMDD | 2017-03-01 |
| PublicationDate_xml | – month: 03 year: 2017 text: 2017-March |
| PublicationDecade | 2010 |
| PublicationTitle | 2017 IEEE 2nd Advanced Information Technology, Electronic and Automation Control Conference (IAEAC) |
| PublicationTitleAbbrev | IAEAC |
| PublicationYear | 2017 |
| Publisher | IEEE |
| Publisher_xml | – name: IEEE |
| Score | 1.6622632 |
| Snippet | This paper presents a method based on modified Radial Basis Function(RBF) neural network to improve the accuracy and recall rate for detection of duplicated... |
| SourceID | ieee |
| SourceType | Publisher |
| StartPage | 2034 |
| SubjectTerms | Classification algorithms; Clustering algorithms; Clustering methods; complex system; Duplicated records; Computational modeling; Conferences; Neural networks; parameter optimizing; Particle swarm optimization; PSO; RBF neural network; SCM |
| Title | Duplicated record detection based on improved RBF neural network |
| URI | https://ieeexplore.ieee.org/document/8054373 |
| hasFullText | 1 |
| inHoldings | 1 |
| isFullTextHit | |
| isPrint | |
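
The abstract states that the similarity of corresponding fields is computed with the Jaro algorithm before the records are labeled. The record gives no implementation details, so the following is only a minimal sketch of the standard Jaro similarity, not the authors' code. The helper `field_similarities`, the dictionary record layout, and the field names in the usage note are hypothetical and introduced purely for illustration.

```python
def jaro(s1: str, s2: str) -> float:
    """Standard Jaro similarity in [0, 1]; 1.0 means identical strings."""
    if s1 == s2:
        return 1.0  # note: two empty strings also count as identical here
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Two characters match only if they are equal and at most `window` apart.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Transpositions: matched characters that appear in a different order.
    seq1 = [c for c, m in zip(s1, matched1) if m]
    seq2 = [c for c, m in zip(s2, matched2) if m]
    transpositions = sum(a != b for a, b in zip(seq1, seq2)) // 2
    return (matches / len1
            + matches / len2
            + (matches - transpositions) / matches) / 3.0


def field_similarities(rec_a: dict, rec_b: dict, fields: list) -> list:
    """Hypothetical helper: one Jaro score per corresponding field."""
    return [jaro(str(rec_a.get(f, "")), str(rec_b.get(f, ""))) for f in fields]
```

A call such as `field_similarities(a, b, ["name", "city"])` yields one score per field for a pair of records. A vector of this kind is one plausible form of input for a classifier like the RBF network named in the abstract, although the record itself does not specify the feature layout.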