Robust principal component analysis of electromagnetic arrays with missing data


Bibliographic Details
Published in: Geophysical Journal International, Vol. 190, no. 3, pp. 1423-1438
Main Authors: Smirnov, M. Yu; Egbert, G. D.
Format: Journal Article
Language: English
Published: Oxford, UK: Blackwell Publishing Ltd, 01.09.2012
ISSN: 0956-540X, 1365-246X
DOI: 10.1111/j.1365-246X.2012.05569.x

Summary: We describe a new algorithm for robust principal component analysis (PCA) of electromagnetic (EM) array data, extending previously developed multivariate methods to arrays with large data gaps and only partial overlap between site occupations. Our approach is based on a criss‐cross regression scheme in which polarization parameters and spatial modes are alternately estimated with robust regression procedures. The basic scheme can be viewed as an expectation robust (ER) algorithm, of the sort widely discussed in the statistical literature on robust PCA, but with details tailored to the physical specifics of EM array observations. We have tested the algorithm on synthetic and real data, including data denial experiments in which we created artificial gaps and compared results obtained with full and incomplete data arrays. These tests show that for modest amounts of missing data (up to about 20 per cent) the algorithm performs well, reproducing essentially the same dominant spatial modes that would be obtained from analysis of the complete array. The algorithm thus makes multivariate analysis practical for the first time for large heterogeneous arrays, as we illustrate by application to two different EM arrays.
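
The alternating ("criss‐cross") idea in the summary can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the data are real-valued and synthetic (EM array data would normally be complex Fourier coefficients), Huber-weighted iteratively reweighted least squares stands in for the paper's robust regression procedures, and the names `huber_weights`, `robust_fit` and `criss_cross_pca` are hypothetical. Rows of `X` play the role of data channels (spatial modes live in `U`), and columns play the role of time segments (polarization parameters live in `V`).

```python
import numpy as np

def huber_weights(r, c=1.5):
    """Huber weights for residuals r, scaled by a MAD-based estimate."""
    scale = 1.4826 * np.median(np.abs(r)) + 1e-12
    a = np.abs(r) / scale
    return np.where(a <= c, 1.0, c / np.maximum(a, 1e-12))

def robust_fit(A, y, n_irls=8):
    """Robust solution of y ~ A @ x by iteratively reweighted least squares."""
    w = np.ones(len(y))
    for _ in range(n_irls):
        sw = np.sqrt(w)
        x = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)[0]
        w = huber_weights(y - A @ x)
    return x

def criss_cross_pca(X, mask, k, n_iter=10):
    """Alternating robust regressions on the observed entries only.

    X    : (n_channels, n_samples) data array
    mask : boolean array, True where X is observed
    Assumes every row and column has at least k observed entries.
    """
    n, m = X.shape
    # Start from the SVD of the zero-filled array (an illustrative choice).
    U = np.linalg.svd(np.where(mask, X, 0.0), full_matrices=False)[0][:, :k]
    V = np.zeros((k, m))
    for _ in range(n_iter):
        # "Polarization parameters": one robust fit per column, modes fixed.
        for j in range(m):
            obs = mask[:, j]
            V[:, j] = robust_fit(U[obs], X[obs, j])
        # "Spatial modes": one robust fit per row, parameters fixed.
        for i in range(n):
            obs = mask[i]
            U[i] = robust_fit(V[:, obs].T, X[i, obs])
        # Re-orthonormalize the modes without changing the product U @ V.
        U, R = np.linalg.qr(U)
        V = R @ V
    return U, V

# Synthetic test: rank-2 signal, small noise, 3% gross outliers, 20% gaps.
rng = np.random.default_rng(0)
n, m, k = 12, 200, 2
U_true = rng.standard_normal((n, k))
V_true = rng.standard_normal((k, m))
X = U_true @ V_true + 0.01 * rng.standard_normal((n, m))
outliers = rng.random((n, m)) < 0.03
X[outliers] += 10.0 * rng.standard_normal(outliers.sum())
mask = rng.random((n, m)) > 0.2

U_est, V_est = criss_cross_pca(X, mask, k)

# Sine of the largest principal angle between true and estimated subspaces.
Q_true = np.linalg.qr(U_true)[0]
err = np.linalg.norm(U_est - Q_true @ (Q_true.T @ U_est), 2)
print(f"sine of largest principal angle: {err:.3f}")
```

Because each regression uses only the entries flagged in `mask`, gaps never need to be filled in before fitting, which is the property that makes this kind of scheme attractive for arrays with partial overlap between site occupations.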