Optimal and exact recovery on general non-uniform Hypergraph Stochastic Block Model
Main Authors | , |
---|---|
Format | Journal Article |
Language | English |
Published | 25.04.2023 |
Subjects | |
DOI | 10.48550/arxiv.2304.13139 |
Summary: | Consider the community detection problem in random hypergraphs under the
non-uniform hypergraph stochastic block model (HSBM), where each hyperedge
appears independently with a given probability that depends only on the labels
of its vertices. We establish, for the first time in the literature, a sharp
threshold for exact recovery in this non-uniform case, subject to minor
constraints; in particular, we consider the model with multiple communities.
One crucial point is that by aggregating information from all the uniform
layers, we may obtain exact recovery even in cases where it appears
impossible when each layer is considered alone. In addition, we prove a
wide-ranging, information-theoretic lower bound on the number of misclassified
vertices for any algorithm, depending on a generalized Chernoff-Hellinger
divergence involving the model parameters. We provide two efficient algorithms
that achieve exact recovery above the threshold and, when exact recovery is
impossible, attain the lowest possible mismatch ratio, which we prove to be
optimal. The theoretical analysis of our algorithms relies on the
concentration and regularization of the adjacency matrix for non-uniform
random hypergraphs, which could be of independent interest. We also address
some open problems regarding parameter knowledge and estimation. |
---|---|
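The model and the mismatch ratio described in the summary can be illustrated with a minimal Python sketch. It is not the paper's algorithm, and it rests on simplifying assumptions that do not come from the record: each uniform layer of hyperedge size `m` uses only two probabilities, `p_in` when all vertices of a hyperedge share a label and `p_out` otherwise, whereas the paper allows the probability to depend on the labels more generally. The function names `sample_nonuniform_hsbm` and `mismatch_ratio` are hypothetical, and the brute-force enumeration is only suitable for small inputs.

```python
import itertools
import random


def sample_nonuniform_hsbm(labels, layer_probs, seed=0):
    """Sample hyperedges from a simplified non-uniform HSBM.

    labels: list with one community label per vertex.
    layer_probs: dict mapping hyperedge size m -> (p_in, p_out).
        Assumption for illustration: p_in applies when all m vertices share
        a label, p_out otherwise.
    Returns a list of hyperedges, each a tuple of vertex indices.
    """
    rng = random.Random(seed)
    n = len(labels)
    hyperedges = []
    # The non-uniform hypergraph is the union of independent uniform layers.
    for m, (p_in, p_out) in sorted(layer_probs.items()):
        for edge in itertools.combinations(range(n), m):
            same_community = len({labels[v] for v in edge}) == 1
            prob = p_in if same_community else p_out
            # Each hyperedge appears independently with its own probability.
            if rng.random() < prob:
                hyperedges.append(edge)
    return hyperedges


def mismatch_ratio(true_labels, estimated_labels):
    """Fraction of misclassified vertices, minimized over community relabelings."""
    communities = sorted(set(true_labels) | set(estimated_labels))
    n = len(true_labels)
    best = n
    # Community names are arbitrary, so try every permutation of them.
    for perm in itertools.permutations(communities):
        relabel = dict(zip(communities, perm))
        errors = sum(relabel[e] != t for t, e in zip(true_labels, estimated_labels))
        best = min(best, errors)
    return best / n


if __name__ == "__main__":
    labels = [0, 0, 0, 1, 1, 1]
    # Two layers: ordinary edges (m = 2) and 3-uniform hyperedges (m = 3).
    edges = sample_nonuniform_hsbm(labels, {2: (0.8, 0.1), 3: (0.6, 0.05)}, seed=1)
    print(len(edges), "hyperedges sampled")
    print(mismatch_ratio(labels, [1, 1, 1, 0, 0, 0]))
```

In this sketch, exact recovery corresponds to a mismatch ratio of zero. Any estimator that clusters the sampled hyperedges can be scored with `mismatch_ratio`.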