Variational Bayes Ensemble Learning Neural Networks With Compressed Feature Space

Bibliographic Details
Published in: IEEE Transactions on Neural Networks and Learning Systems, Vol. 35, No. 1, pp. 1379-1385
Main Authors: Liu, Zihuan; Bhattacharya, Shrijita; Maiti, Tapabrata
Format: Journal Article
Language: English
Published: United States: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.01.2024
ISSN: 2162-237X
eISSN: 2162-2388
DOI: 10.1109/TNNLS.2022.3172276

Summary: We consider the problem of nonparametric classification from a high-dimensional input vector (small n, large p problem). To handle the high-dimensional feature space, we propose a random projection (RP) of the feature space followed by training of a neural network (NN) on the compressed feature space. Unlike regularization techniques (lasso, ridge, etc.), which train on the full data, NNs based on compressed feature space have significantly lower computational complexity and memory storage requirements. Nonetheless, a random compression-based method is often sensitive to the choice of compression. To address this issue, we adopt a Bayesian model averaging (BMA) approach and leverage the posterior model weights to determine: 1) uncertainty under each compression and 2) intrinsic dimensionality of the feature space (the effective dimension of the feature space useful for prediction). The final prediction is improved by averaging models with projected dimensions close to the intrinsic dimensionality. Furthermore, we propose a variational approach to the aforementioned BMA to allow for simultaneous estimation of both model weights and model-specific parameters. Since the proposed variational solution is parallelizable across compressions, it preserves the computational gain of frequentist ensemble techniques while providing the full uncertainty quantification of a Bayesian approach. We establish the asymptotic consistency of the proposed algorithm under a suitable characterization of the RPs and the prior parameters. Finally, we provide extensive numerical examples for empirical validation of the proposed method.
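
To make the pipeline described in the abstract concrete, the Python sketch below illustrates the general idea only: project the features with several random projections of different dimensions, train a small NN on each compressed space, and average predictions with data-driven model weights. It is not the authors' algorithm; it assumes scikit-learn, and it substitutes a softmax over held-out log-likelihoods for the paper's variational posterior model weights. The candidate dimensions, network sizes, and synthetic data are illustrative choices.

    # Sketch of RP-ensemble classification with model-averaged predictions.
    # NOTE: stand-in for the paper's variational BMA, not its implementation.
    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier
    from sklearn.random_projection import GaussianRandomProjection

    rng = np.random.default_rng(0)
    n, p = 200, 1000                                # small n, large p
    X = rng.standard_normal((n, p))
    y = (X[:, 0] + X[:, 1] > 0).astype(int)         # only 2 features matter

    X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.3,
                                              random_state=0)

    models, log_liks = [], []
    for k in (2, 5, 10, 50):                        # candidate projected dims
        proj = GaussianRandomProjection(n_components=k, random_state=k)
        Z_tr = proj.fit_transform(X_tr)             # compress the features
        nn = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500,
                           random_state=0).fit(Z_tr, y_tr)
        prob = nn.predict_proba(proj.transform(X_va))
        # Held-out log-likelihood of the true labels under this model.
        ll = np.log(prob[np.arange(len(y_va)), y_va] + 1e-12).sum()
        models.append((proj, nn))
        log_liks.append(ll)

    # Softmax over log-likelihoods: a crude proxy for posterior model
    # weights; dimensions near the intrinsic one should dominate.
    w = np.exp(log_liks - np.max(log_liks))
    w /= w.sum()

    def predict_proba(X_new):
        """Model-averaged class probabilities across all projections."""
        return sum(wi * nn.predict_proba(proj.transform(X_new))
                   for wi, (proj, nn) in zip(w, models))

    print("model weights:", np.round(w, 3))

Because each (projection, NN) pair is fitted independently, the loop parallelizes trivially across compressions, which mirrors the computational advantage the abstract claims for the variational solution.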