Duplicated record detection based on improved RBF neural network
| Published in | 2017 IEEE 2nd Advanced Information Technology, Electronic and Automation Control Conference (IAEAC) pp. 2034 - 2037 |
|---|---|
| Main Authors | , , , |
| Format | Conference Proceeding |
| Language | English |
| Published | IEEE, 01.03.2017 |
| Subjects | |
| DOI | 10.1109/IAEAC.2017.8054373 |
| Summary: | This paper presents a method based on a modified Radial Basis Function (RBF) neural network to improve the accuracy and recall rate of duplicated record detection. First, the key fields of the records are clustered with Density-Based Spatial Clustering of Applications with Noise (DBSCAN), so that all records are grouped into several classes that contain the duplicated records. Second, the similarity of corresponding fields of the records in each class is computed with the Jaro algorithm, and duplicated records are labeled manually. Finally, the Subtractive Clustering Method (SCM) and Particle Swarm Optimization (PSO) are used to optimize the parameters of the RBF neural network, which yields a monitoring model for duplicated records. The method is tested on different datasets. The experimental results show that the accuracy and recall rate of duplicated record detection are improved significantly. |
|---|---|
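
The second step of the summary, computing a Jaro similarity for each pair of corresponding fields, can be illustrated with a minimal sketch. This is the textbook Jaro definition written in plain Python, not the authors' implementation; the helper `field_similarities`, the field names, and the sample records are hypothetical and serve only to show how a per-field similarity vector (a plausible input to a classifier such as an RBF network) might be assembled.

```python
def jaro(s1: str, s2: str) -> float:
    """Textbook Jaro similarity in [0, 1]; 1.0 means identical strings."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0

    # Characters count as matching only within this sliding window.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0

    for i in range(len1):
        lo = max(0, i - window)
        hi = min(i + window + 1, len2)
        for j in range(lo, hi):
            if not matched2[j] and s1[i] == s2[j]:
                matched1[i] = matched2[j] = True
                matches += 1
                break

    if matches == 0:
        return 0.0

    # Count matched characters that appear in a different order.
    out_of_order = 0
    k = 0
    for i in range(len1):
        if not matched1[i]:
            continue
        while not matched2[k]:
            k += 1
        if s1[i] != s2[k]:
            out_of_order += 1
        k += 1
    transpositions = out_of_order // 2

    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3.0


def field_similarities(rec_a: dict, rec_b: dict, fields: list) -> list:
    """Hypothetical helper: one Jaro score per corresponding field."""
    return [jaro(str(rec_a[f]), str(rec_b[f])) for f in fields]


# Illustrative records only; field names are assumptions, not from the paper.
a = {"name": "MARTHA", "city": "Boston"}
b = {"name": "MARHTA", "city": "Boston"}
vector = field_similarities(a, b, ["name", "city"])
```

In a pipeline like the one summarized above, each record pair within a cluster would yield one such vector, and the manually assigned duplicate/non-duplicate labels would serve as training targets.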