Knowledge-based data analysis comes of age
| Published in | Briefings in bioinformatics Vol. 11; no. 1; pp. 30 - 39 |
|---|---|
| Main Author | |
| Format | Journal Article |
| Language | English |
| Published | England: Oxford University Press, 01.01.2010; Oxford Publishing Limited (England) |
| Subjects | |
| ISSN | 1467-5463, 1477-4054 |
| DOI | 10.1093/bib/bbp044 |
| Summary: | The emergence of high-throughput technologies for measuring biological systems has introduced problems for data interpretation that must be addressed for proper inference. First, analysis techniques need to be matched to the biological system, reflecting in their mathematical structure the underlying behavior being studied. When this is not done, mathematical techniques will generate answers, but the values and reliability estimates may not accurately reflect the biology. Second, analysis approaches must address the vast excess in variables measured (e.g. transcript levels of genes) over the number of samples (e.g. tumors, time points), known as the 'large-p, small-n' problem. In large-p, small-n paradigms, standard statistical techniques generally fail, and computational learning algorithms are prone to overfit the data. Here we review the emergence of techniques that match mathematical structure to the biology, the use of integrated data and prior knowledge to guide statistical analysis, and the recent emergence of analysis approaches utilizing simple biological models. We show that novel biological insights have been gained using these techniques. |
|---|---|