Cellwise outlier detection and biomarker identification in metabolomics based on pairwise log ratios

Bibliographic Details
Published in Journal of chemometrics Vol. 34; no. 1; pp. e3182 - n/a
Main Authors Walach, Jan; Filzmoser, Peter; Kouřil, Štěpán; Friedecký, David; Adam, Tomáš
Format Journal Article
Language English
Published England Wiley Subscription Services, Inc 01.01.2020
John Wiley and Sons Inc
ISSN 0886-9383
1099-128X
DOI 10.1002/cem.3182

Summary: Data outliers can carry very valuable information and may be the most informative observations for interpretation. Nevertheless, they are often neglected. An algorithm called cellwise outlier diagnostics using robust pairwise log ratios (cell‐rPLR) is proposed for identifying outliers in single cells of a data matrix. The algorithm is designed for metabolomic data, where, because of the size effect, the measured values are not directly comparable. Pairwise log ratios between the variable values form the elemental information for the algorithm, and aggregating appropriate outlyingness values of these log ratios yields outlyingness information for each cell. A further feature of cell‐rPLR is that it is useful for biomarker identification, particularly in the presence of cellwise outliers. Real data examples and simulation studies underline the good performance of this algorithm in comparison with alternative methods.
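The idea of scoring cells through pairwise log ratios can be illustrated with a minimal sketch. This is not the cell‐rPLR algorithm from the article, whose details are not given in this record. It assumes a strictly positive data matrix, standardizes each pairwise log ratio with the median and MAD, and aggregates with a plain median. The names `robust_scale` and `cell_outlyingness` are hypothetical, and the sketch further assumes that no pair of variables has a constant ratio, since the MAD would then be zero.

```python
import math
import random
from statistics import median


def robust_scale(values):
    """Return the median and the MAD (scaled by 1.4826) of a list of numbers."""
    center = median(values)
    mad = 1.4826 * median(abs(v - center) for v in values)
    return center, mad


def cell_outlyingness(X):
    """Score each cell of a strictly positive matrix X (list of rows).

    Illustrative assumption, not the published method: score[i][j] is the
    median over k != j of the robustly standardized log ratio
    log(X[i][j] / X[i][k]).
    """
    n, p = len(X), len(X[0])
    logX = [[math.log(v) for v in row] for row in X]
    per_cell = [[[] for _ in range(p)] for _ in range(n)]
    for j in range(p):
        for k in range(p):
            if j == k:
                continue
            # Log ratio of variable j to variable k for every sample.
            ratios = [logX[i][j] - logX[i][k] for i in range(n)]
            center, scale = robust_scale(ratios)
            for i in range(n):
                per_cell[i][j].append((ratios[i] - center) / scale)
    # Aggregate the standardized log ratios of each cell with a median.
    return [[median(zs) for zs in row] for row in per_cell]


# Synthetic demo data (made up for illustration): 50 samples, 6 variables.
rng = random.Random(0)
X = [[math.exp(rng.gauss(0.0, 0.3)) for _ in range(6)] for _ in range(50)]
X[3][2] *= 50.0  # plant a single cellwise outlier

scores = cell_outlyingness(X)
flagged = max(
    ((i, j) for i in range(50) for j in range(6)),
    key=lambda ij: abs(scores[ij[0]][ij[1]]),
)
print(flagged)
```

Because each score depends only on ratios within a sample, multiplying a whole row by a constant (a size effect) leaves the scores unchanged, which is why log ratios suit data whose raw values are not directly comparable.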