Approximating K‐means‐type Clustering via Semidefinite Programming
Published in | SIAM Journal on Optimization, Vol. 18, no. 1, pp. 186-205 |
---|---|
Main Authors | , |
Format | Journal Article |
Language | English |
Published | Philadelphia: Society for Industrial and Applied Mathematics, 01.01.2007 |
ISSN | 1052-6234 1095-7189 |
DOI | 10.1137/050641983 |
Summary: | One of the fundamental clustering problems is to assign $n$ points into $k$ clusters based on minimal sum-of-squared distances (MSSC), which is known to be NP-hard. In this paper, by using matrix arguments, we first model MSSC as a so-called 0-1 semidefinite programming (SDP) problem. We show that our 0-1 SDP model provides a unified framework for several clustering approaches such as normalized k-cut and spectral clustering. Moreover, the 0-1 SDP model allows us to solve the underlying problem approximately via the linear programming and SDP relaxations. Second, we consider the issue of how to extract a feasible solution of the original 0-1 SDP model from the optimal solution of the relaxed SDP problem. By using principal component analysis, we develop a rounding procedure to construct a feasible partitioning from a solution of the relaxed problem. In our rounding procedure, we need to solve a K-means clustering problem in $\Re^{k-1}$, which can be done in $O(n^{k^2-2k+2})$ time. In case of biclustering, the running time of our rounding procedure can be reduced to $O(n\log n)$. We show that our algorithm provides a 2-approximate solution to the original problem. Promising numerical results for biclustering based on our new method are reported. |
---|---|
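
The summary notes that, for biclustering, the rounding step reduces to a K-means problem on the real line that can be solved in $O(n\log n)$ time. The sketch below illustrates why a one-dimensional 2-means problem is that cheap: an optimal split is contiguous in sorted order, so sorting plus a prefix-sum scan is enough. It is not the authors' procedure. As an assumption made for illustration, it projects the raw data onto the first principal direction instead of working from a solution of the relaxed SDP, and the function names `best_two_split_1d` and `bicluster_by_projection` are hypothetical.

```python
import numpy as np


def best_two_split_1d(values):
    """Exact 2-means on scalars: sort, then score every contiguous split."""
    x_in = np.asarray(values, dtype=float)
    n = x_in.size
    if n < 2:
        raise ValueError("need at least two points")
    order = np.argsort(x_in, kind="stable")
    x = x_in[order]

    prefix = np.cumsum(x)          # prefix[i] = x[0] + ... + x[i]
    prefix_sq = np.cumsum(x * x)   # same for squared values
    sizes = np.arange(1, n)        # candidate left-cluster sizes 1..n-1

    left_sum, left_sq = prefix[:-1], prefix_sq[:-1]
    right_sum, right_sq = prefix[-1] - left_sum, prefix_sq[-1] - left_sq

    # Within-cluster sum of squares: sum(x^2) - (sum x)^2 / size.
    cost = (left_sq - left_sum**2 / sizes) + (right_sq - right_sum**2 / (n - sizes))

    m = int(np.argmin(cost)) + 1   # number of points in the left cluster
    labels = np.empty(n, dtype=int)
    labels[order[:m]] = 0
    labels[order[m:]] = 1
    return labels, float(cost[m - 1])


def bicluster_by_projection(points):
    """Assumed stand-in for the rounding step: project on PC1, then split."""
    X = np.asarray(points, dtype=float)
    centered = X - X.mean(axis=0)
    # The leading right singular vector is the first principal direction.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[0]
    labels, _ = best_two_split_1d(scores)
    return labels
```

The sort dominates the cost of `best_two_split_1d`; the scan over splits is linear. Because the sign of a singular vector is arbitrary, the two labels returned by `bicluster_by_projection` may be swapped between runs or platforms, so only the grouping is meaningful.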