Duplicated record detection based on improved RBF neural network

Bibliographic Details
Published in: 2017 IEEE 2nd Advanced Information Technology, Electronic and Automation Control Conference (IAEAC), pp. 2034-2037
Main Authors: Liu, Xinting; Cai, Xiaodong; Li, Bo; Chen, Mingyao
Format: Conference Proceeding
Language: English
Published: IEEE, 01.03.2017
DOI: 10.1109/IAEAC.2017.8054373

Summary: This paper presents a method based on a modified Radial Basis Function (RBF) neural network that improves the accuracy and recall rate of duplicated record detection. First, the key fields of the records are clustered with Density-Based Spatial Clustering of Applications with Noise (DBSCAN), so that all records are grouped into several classes, each of which may contain duplicated records. Second, within each class, the similarity of corresponding fields of records is computed using the Jaro algorithm, and duplicated records are labeled manually. Finally, the Subtractive Clustering Method (SCM) and Particle Swarm Optimization (PSO) are used to optimize the parameters of the RBF neural network, yielding a detection model for duplicated records. The method is tested on different datasets. The experimental results show that the accuracy and recall rate of duplicated record detection are improved significantly.
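The second step of the summary, field-by-field comparison with the Jaro algorithm, can be illustrated with a minimal sketch. This is a textbook Jaro similarity written in plain Python, not the authors' implementation; the helper `field_similarities`, the record layout (dictionaries), and the field names `name` and `city` are assumptions made only for illustration. The resulting vector of per-field scores is the kind of input a classifier such as an RBF network could be trained on.

```python
def jaro(s1: str, s2: str) -> float:
    """Standard Jaro similarity in [0, 1]; 1.0 means identical strings."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0

    # Characters count as matching only if they lie within this window.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0

    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order / 2

    return (matches / len1
            + matches / len2
            + (matches - transpositions) / matches) / 3


def field_similarities(rec_a: dict, rec_b: dict, fields: list) -> list:
    """Hypothetical helper: one Jaro score per compared field."""
    return [jaro(str(rec_a[f]), str(rec_b[f])) for f in fields]


# Hypothetical records; the field names are illustrative only.
a = {"name": "MARTHA", "city": "Guilin"}
b = {"name": "MARHTA", "city": "Guilin"}
features = field_similarities(a, b, ["name", "city"])
```

Here `features` holds one score per field, so a pair of records with a transposed name but an identical city yields a high score in both positions. How the paper thresholds or weights these scores is not stated in the summary.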