Forward variable selection enables fast and accurate dynamic system identification with Karhunen-Loève decomposed Gaussian processes

Bibliographic Details
Published in	PloS one Vol. 19; no. 9; p. e0309661
Main Authors	Hayes, Kyle; Fouts, Michael W.; Baheri, Ali; Mebane, David S.
Format	Journal Article
Language	English
Published	United States: Public Library of Science (PLoS), 20.09.2024
Subjects	Accuracy; Algorithms; Analysis; Approximation; Artificial neural networks; Basis functions; Bayes Theorem; Bayesian analysis; Data mining; Datasets; Decomposition; Dynamical systems; Eigenvectors; Evaluation; Feature selection; Gaussian process; Gaussian processes; Inference; Machine learning; Neural networks; Normal Distribution; Optimization; Recurrent neural networks; System identification; Training; Variance analysis; United States
ISSN	1932-6203
DOI	10.1371/journal.pone.0309661

Summary:	A promising approach for scalable Gaussian processes (GPs) is the Karhunen-Loève (KL) decomposition, in which the GP kernel is represented by a set of basis functions that are the eigenfunctions of the kernel operator. Such decomposed kernels have the potential to be very fast, and do not depend on the selection of a reduced set of inducing points. However, KL decompositions lead to high dimensionality, and variable selection thus becomes paramount. This paper reports a new method of forward variable selection, enabled by the ordered nature of the basis functions in the KL expansion of the Bayesian Smoothing Spline ANOVA (BSS-ANOVA) kernel, coupled with fast Gibbs sampling in a fully Bayesian approach. It quickly and effectively limits the number of terms, yielding a method with competitive accuracy and competitive training and inference times for tabular datasets of low feature set dimensionality. Theoretical computational complexities are $O(NP^2)$ in training and $O(P)$ per point in inference, where $N$ is the number of instances and $P$ the number of expansion terms. The inference speed and accuracy make the method especially useful for dynamic systems identification, by modeling the dynamics in the tangent space as a static problem, then integrating the learned dynamics using a high-order scheme. The methods are demonstrated on two dynamic datasets: a ‘Susceptible, Infected, Recovered’ (SIR) toy problem, along with the experimental ‘Cascaded Tanks’ benchmark dataset. Comparisons on the static prediction of time derivatives are made with a random forest (RF), a residual neural network (ResNet), and the Orthogonal Additive Kernel (OAK) inducing points scalable GP, while for the time-series prediction comparisons are made with LSTM and GRU recurrent neural networks (RNNs) along with the SINDy package.
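Illustration:	The workflow outlined in the summary (learn the time derivatives as a static regression problem, grow an ordered expansion term by term, then integrate the learned dynamics) can be sketched in a few lines of NumPy. This is only a rough illustration under stated assumptions, not the authors' method: an ordered polynomial basis stands in for the BSS-ANOVA KL basis functions, ordinary least squares replaces the Gibbs sampler, a held-out validation error picks the number of terms, and classical fourth-order Runge-Kutta is assumed as the "high-order scheme". The SIR rates, sample sizes, noise level, and all function names are hypothetical choices made for the example.

```python
import numpy as np

BETA, GAMMA = 0.5, 0.1  # assumed SIR rates, chosen only for illustration


def sir_rhs(x):
    """True SIR dynamics for the state x = (S, I); R follows as 1 - S - I."""
    s, i = x[..., 0], x[..., 1]
    return np.stack([-BETA * s * i, BETA * s * i - GAMMA * i], axis=-1)


# Ordered candidate terms, lowest order first. This polynomial list is a
# stand-in for an ordered KL basis, not the BSS-ANOVA basis itself.
TERMS = [
    lambda s, i: np.ones_like(s),
    lambda s, i: s,
    lambda s, i: i,
    lambda s, i: s * i,
    lambda s, i: s ** 2,
    lambda s, i: i ** 2,
]


def design(x, n_terms):
    """Evaluate the first n_terms basis functions at the state(s) x."""
    s, i = x[..., 0], x[..., 1]
    return np.stack([f(s, i) for f in TERMS[:n_terms]], axis=-1)


# Static problem: noisy derivative observations at sampled states.
rng = np.random.default_rng(0)
x_train = rng.uniform(0.0, 1.0, size=(200, 2))
x_val = rng.uniform(0.0, 1.0, size=(100, 2))
noise = 1e-4
y_train = sir_rhs(x_train) + noise * rng.standard_normal((200, 2))
y_val = sir_rhs(x_val) + noise * rng.standard_normal((100, 2))

# Forward pass over the ordered terms: fit each prefix of the expansion
# and record its validation error.
val_mse, weights = [], []
for p in range(1, len(TERMS) + 1):
    w, *_ = np.linalg.lstsq(design(x_train, p), y_train, rcond=None)
    weights.append(w)
    val_mse.append(np.mean((design(x_val, p) @ w - y_val) ** 2))

# Keep the shortest expansion whose error is close to the best one seen.
best = min(val_mse)
n_terms = next(p for p, e in enumerate(val_mse, start=1) if e <= 1.5 * best)
w_sel = weights[n_terms - 1]


def learned_rhs(x):
    """Learned tangent-space model: O(P) work per evaluation."""
    return design(x, n_terms) @ w_sel


def rk4(f, x0, dt, n_steps):
    """Integrate dx/dt = f(x) with classical fourth-order Runge-Kutta."""
    xs = [np.asarray(x0, dtype=float)]
    for _ in range(n_steps):
        x = xs[-1]
        k1 = f(x)
        k2 = f(x + 0.5 * dt * k1)
        k3 = f(x + 0.5 * dt * k2)
        k4 = f(x + dt * k3)
        xs.append(x + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4))
    return np.array(xs)


traj_true = rk4(sir_rhs, [0.99, 0.01], dt=0.1, n_steps=500)
traj_fit = rk4(learned_rhs, [0.99, 0.01], dt=0.1, n_steps=500)
max_err = float(np.max(np.abs(traj_true - traj_fit)))
print(f"terms kept: {n_terms}, max trajectory error: {max_err:.2e}")
```

Because the candidate terms are ordered, the search only has to decide where to truncate the expansion instead of testing arbitrary subsets, which is the property the summary attributes to the KL expansion. Each least-squares fit here costs on the order of $NP^2$ operations and each model evaluation on the order of $P$, matching the scalings quoted in the summary, although the fully Bayesian sampler described there is a different procedure.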
Bibliography:	USDOE Subcontract No. P010220883 Task 30