A general algorithm for covariance modeling of discrete data


Bibliographic Details
Published in Journal of multivariate analysis, Vol. 165, pp. 86-100
Main Authors Popovic, Gordana C.; Hui, Francis K.C.; Warton, David I.
Format Journal Article
Language English
Published Elsevier Inc 01.05.2018
ISSN 0047-259X, 1095-7243
DOI 10.1016/j.jmva.2017.12.002

Summary: We propose an algorithm that generalizes to discrete data any given covariance modeling algorithm originally intended for Gaussian responses, via a Gaussian copula approach. Covariance modeling is a powerful tool for extracting meaning from multivariate data, and fast algorithms for Gaussian data, such as factor analysis and Gaussian graphical models, are widely available. Our algorithm makes these tools generally available to analysts of discrete data and can combine any likelihood-based covariance modeling method for Gaussian data with any set of discrete marginal distributions. Previously, tools for discrete data were generally specific to one family of distributions or covariance modeling paradigm, or otherwise did not exist. Our algorithm is more flexible than alternate methods, takes advantage of existing fast algorithms for Gaussian data, and simulations suggest that it outperforms competing graphical modeling and factor analysis procedures for count and binomial data. We additionally show that in a Gaussian copula graphical model with discrete margins, conditional independence relationships in the latent Gaussian variables are inherited by the discrete observations. Our method is illustrated with a graphical model and factor analysis on an overdispersed ecological count dataset of species abundances.
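
The Gaussian copula idea mentioned in the summary can be illustrated with a minimal sketch. This is not the authors' algorithm, which the record does not describe in detail. It only shows, under stated assumptions, how discrete counts can be mapped to latent Gaussian scores to which a Gaussian covariance method could then be applied. Assumptions made here for illustration: Poisson margins, the marginal mean estimated by the sample mean, and a single random draw of the latent scores per observation. All function names are hypothetical.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def poisson_cdf(k, mu):
    """P(Y <= k) for Y ~ Poisson(mu); returns 0 for k < 0."""
    if k < 0:
        return 0.0
    term = math.exp(-mu)
    total = term
    for i in range(1, k + 1):
        term *= mu / i
        total += term
    return min(total, 1.0)


def poisson_quantile(u, mu):
    """Smallest k with P(Y <= k) >= u."""
    u = min(u, 1 - 1e-12)  # guard against rounding in the CDF sum
    k = 0
    while poisson_cdf(k, mu) < u:
        k += 1
    return k


def latent_scores(counts, mu, rng):
    """Map counts to Gaussian scores.

    Each count y corresponds to the interval [F(y-1), F(y)] on the
    uniform scale; one point is drawn uniformly from that interval and
    pushed through the standard normal quantile function.
    """
    scores = []
    for y in counts:
        lo, hi = poisson_cdf(y - 1, mu), poisson_cdf(y, mu)
        u = min(max(rng.uniform(lo, hi), 1e-12), 1 - 1e-12)
        scores.append(STD_NORMAL.inv_cdf(u))
    return scores


def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def demo(seed=0, n=2000, rho=0.7, mu=5.0):
    """Simulate two copula-linked Poisson counts and recover the latent correlation."""
    rng = random.Random(seed)
    y1, y2 = [], []
    for _ in range(n):
        # Correlated standard normal pair with correlation rho.
        z1 = rng.gauss(0, 1)
        z2 = rho * z1 + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1)
        # Discretize each latent variable through the Poisson quantile function.
        y1.append(poisson_quantile(STD_NORMAL.cdf(z1), mu))
        y2.append(poisson_quantile(STD_NORMAL.cdf(z2), mu))
    s1 = latent_scores(y1, sum(y1) / n, rng)
    s2 = latent_scores(y2, sum(y2) / n, rng)
    return pearson(s1, s2)


if __name__ == "__main__":
    print(f"estimated latent correlation: {demo():.2f}")
```

Because the latent scores are drawn at random within each interval, the recovered correlation is somewhat attenuated relative to the true latent value. A likelihood-based procedure would account for this uncertainty in the latent scores, which this single-draw sketch does not.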