k-Mnv-Rep: A k-type clustering algorithm for matrix-object data
| Published in | Information Sciences, Vol. 542, pp. 40-57 |
|---|---|
| Main Authors | , , , , |
| Format | Journal Article |
| Language | English |
| Published | Elsevier Inc, 04.01.2021 |
| ISSN | 0020-0255, 1872-6291 |
| DOI | 10.1016/j.ins.2020.06.071 |
Summary:
• We define a novel dissimilarity measure between two numeric matrix-objects.
• We provide a policy for updating cluster centers for numeric matrix-object data.
• We propose the k-Mnv-Rep algorithm to cluster numeric matrix-object data.
• We propose the k-Mv-Rep algorithm to cluster hybrid matrix-object data.

In matrix-object data, an object (or sample) is described by more than one feature vector (record), and all of those feature vectors are responsible for the observed classification of the object. The task is to cluster such data into a set of groups by analyzing and utilizing the information in the feature vectors. Matrix-object data are widespread in real applications. Previous studies typically address data sets in which each object is represented by a single feature vector, an assumption that is violated in many real-world tasks. In this paper, we propose a k-multi-numeric-values-representatives (abbr. k-Mnv-Rep) algorithm to cluster numeric matrix-object data. In this algorithm, a new dissimilarity measure between two numeric matrix-objects is defined and a new heuristic method for updating cluster centers is given. Furthermore, we propose a k-multi-values-representatives (abbr. k-Mv-Rep) algorithm to cluster hybrid matrix-object data. The two proposed algorithms overcome the limitations of previous studies and can be applied to the matrix-object data sets that arise in many real-world tasks. The benefits and effectiveness of the two algorithms are demonstrated by experiments on real and synthetic data sets.
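To make the notion of matrix-object data concrete, the following is a minimal Python sketch of a k-type clustering loop over objects that each consist of several feature vectors. This record does not state the paper's dissimilarity measure or its center-update heuristic, so both are stand-ins chosen for illustration: `dissimilarity` is a symmetric mean nearest-row Euclidean distance, the center update picks a medoid (the member object with the smallest total dissimilarity to its cluster), and initialization is farthest-first. The names `dissimilarity` and `cluster` are hypothetical. This is not the k-Mnv-Rep algorithm itself.

```python
import numpy as np


def dissimilarity(a, b):
    """Stand-in measure (an assumption, not the paper's definition):
    symmetric mean nearest-row Euclidean distance between two matrix-objects."""
    # d[i, j] = distance between row i of a and row j of b
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    return 0.5 * (d.min(axis=1).mean() + d.min(axis=0).mean())


def cluster(objects, k, n_iter=20):
    """k-type loop with a medoid-style center update (an assumption)."""
    # Pairwise dissimilarities between all matrix-objects
    D = np.array([[dissimilarity(x, y) for y in objects] for x in objects])

    # Farthest-first initialization: start at object 0, then repeatedly add
    # the object whose nearest chosen center is farthest away
    centers = [0]
    while len(centers) < k:
        centers.append(int(D[:, centers].min(axis=1).argmax()))
    centers = np.array(centers)

    for _ in range(n_iter):
        # Assignment step: each object joins its nearest center
        labels = D[:, centers].argmin(axis=1)
        new_centers = centers.copy()
        for j in range(k):
            members = np.flatnonzero(labels == j)
            if members.size:
                # Update step: member with smallest total dissimilarity
                within = D[np.ix_(members, members)].sum(axis=1)
                new_centers[j] = members[within.argmin()]
        if np.array_equal(new_centers, centers):
            break
        centers = new_centers

    labels = D[:, centers].argmin(axis=1)
    return labels, centers


# Ten synthetic matrix-objects, each with 2 to 4 two-dimensional records:
# five near the origin and five near (5, 5)
rng = np.random.default_rng(1)
objects = (
    [rng.normal(0.0, 0.1, size=(rng.integers(2, 5), 2)) for _ in range(5)]
    + [rng.normal(5.0, 0.1, size=(rng.integers(2, 5), 2)) for _ in range(5)]
)
labels, centers = cluster(objects, k=2)
print(labels, centers)
```

Each object here is a 2-D array whose rows are its feature vectors, and objects may have different numbers of rows, which is what distinguishes matrix-object data from the usual one-vector-per-object setting. A medoid update is used only because it needs nothing beyond pairwise dissimilarities. The paper's own heuristic for building cluster representatives may differ substantially.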