Ten simple rules for initial data analysis


Bibliographic Details
Published in PLoS computational biology Vol. 18; no. 2; p. e1009819
Main Authors Baillie, Mark; le Cessie, Saskia; Schmidt, Carsten Oliver; Lusa, Lara; Huebner, Marianne
Format Journal Article
Language English
Published United States Public Library of Science 01.02.2022
Public Library of Science (PLoS)
ISSN 1553-7358
1553-734X
DOI 10.1371/journal.pcbi.1009819

Abstract Typically, researchers do not perform initial data analysis (IDA) in a systematic way, if at all, or they mix IDA activities with subsequent data analysis tasks such as hypothesis generation or exploration, formal analysis, and interpretation of conclusions. The value of an effective IDA strategy for researchers lies in ensuring that data are of sufficient quality and that model assumptions made in the statistical analysis plan (SAP) are satisfied, or in supporting decisions for the statistical analyses and documenting them adequately. IDA requires domain knowledge, especially from researchers who understand why and how the data were measured and collected, together with expertise in data management and stewardship, competencies in planning and implementing data analysis, and experience of scientific computing practices. Make IDA reproducible: IDA is a crucial part of the research pipeline, and as such, it should be well documented to promote transparency, utility, and reproducibility. [...] keeping track of changes that you and your collaborators make to project data, programs (including analysis scripts, libraries, and packages), and documentation (including plans and reports) is a key IDA practice [15].
Audience Academic
Author Schmidt, Carsten Oliver
Huebner, Marianne
Baillie, Mark
Lusa, Lara
le Cessie, Saskia
AuthorAffiliation 1 Novartis, Basel, Switzerland
2 Department of Clinical Epidemiology and Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, Netherlands
3 Institute for Community Medicine, SHIP-KEF University Medicine of Greifswald, Greifswald, Germany
4 Department of Mathematics, Faculty of Mathematics, Natural Sciences and Information Technology, University of Primorska, Koper, Slovenia
5 Department of Statistics and Probability, Michigan State University, East Lansing, Michigan, United States of America
Author_xml – sequence: 1
  givenname: Mark
  orcidid: 0000-0002-5618-0667
  surname: Baillie
  fullname: Baillie, Mark
– sequence: 2
  givenname: Saskia
  surname: le Cessie
  fullname: le Cessie, Saskia
– sequence: 3
  givenname: Carsten Oliver
  orcidid: 0000-0001-5266-9396
  surname: Schmidt
  fullname: Schmidt, Carsten Oliver
– sequence: 4
  givenname: Lara
  orcidid: 0000-0002-8981-2421
  surname: Lusa
  fullname: Lusa, Lara
– sequence: 5
  givenname: Marianne
  orcidid: 0000-0002-9694-9231
  surname: Huebner
  fullname: Huebner, Marianne
BackLink https://www.ncbi.nlm.nih.gov/pubmed/35202399 (View this record in MEDLINE/PubMed)
CitedBy_id crossref_primary_10_1136_bmjopen_2022_066189
crossref_primary_10_1177_0272989X251326069
crossref_primary_10_1177_25152459241297674
crossref_primary_10_1515_stat_2022_0110
crossref_primary_10_1002_pst_2463
crossref_primary_10_3390_ijerph191811260
crossref_primary_10_1186_s13040_023_00326_0
crossref_primary_10_3389_fdgth_2022_932599
crossref_primary_10_1371_journal_pcbi_1010718
crossref_primary_10_14746_rpeis_2024_86_4_14
crossref_primary_10_1016_j_bonr_2023_101730
crossref_primary_10_1002_pst_2368
crossref_primary_10_1016_j_jclinepi_2024_111342
crossref_primary_10_3390_a17030112
crossref_primary_10_1088_1742_6596_2839_1_012019
crossref_primary_10_1186_s12874_024_02294_3
crossref_primary_10_3390_su162411086
crossref_primary_10_1016_j_jval_2023_12_002
crossref_primary_10_1186_s12875_024_02667_z
crossref_primary_10_1016_j_jclinepi_2024_111605
crossref_primary_10_1002_sta4_644
crossref_primary_10_1371_journal_pone_0295726
crossref_primary_10_47836_pjssh_32_S3_09
crossref_primary_10_3390_biomedinformatics2030028
crossref_primary_10_1371_journal_pcbi_1010749
Cites_doi 10.1371/journal.pcbi.1005510
10.1038/sdata.2016.18
10.1016/j.jclinepi.2021.01.008
10.1371/journal.pcbi.1003285
10.1080/00031305.1998.10480528
10.1207/s15327957pspr0203_4
10.1186/s12874-020-00942-y
10.1136/bmj.308.6924.283
10.1038/520612a
10.1353/obs.2018.0014
10.1186/1741-7015-8-24
10.1186/s12874-021-01252-7
10.1371/journal.pcbi.1004961
10.2307/2981525
10.1186/s13059-020-02133-w
10.3389/fpsyg.2016.01832
10.2307/2981969
10.1002/psp4.12455
ContentType Journal Article
Copyright COPYRIGHT 2022 Public Library of Science
2022 Baillie et al. This is an open access article distributed under the terms of the Creative Commons Attribution License: http://creativecommons.org/licenses/by/4.0/ (the “License”), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
2022 Baillie et al
CorporateAuthor for the Topic Group “Initial Data Analysis” of the STRATOS Initiative
Topic Group “Initial Data Analysis” of the STRATOS Initiative
Discipline Biology
DocumentTitleAlternate Initial data analysis
EISSN 1553-7358
ExternalDocumentID 2640120254
oai_doaj_org_article_1ec60f7014d942ab816bba7b4ccf7049
PMC8870512
A695460928
35202399
10_1371_journal_pcbi_1009819
Genre Editorial
Commentary
GeographicLocations United States
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 2
Language English
License This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Creative Commons Attribution License
Notes Membership of the STRATOS Initiative is provided in the Acknowledgments.
The authors have declared that no competing interests exist.
ORCID 0000-0002-5618-0667
0000-0002-9694-9231
0000-0002-8981-2421
0000-0001-5266-9396
OpenAccessLink https://doaj.org/article/1ec60f7014d942ab816bba7b4ccf7049
PMID 35202399
PublicationCentury 2000
PublicationDate 2022-02-01
PublicationDecade 2020
PublicationPlace United States
PublicationPlace_xml – name: United States
– name: San Francisco
– name: San Francisco, CA USA
PublicationTitle PLoS computational biology
PublicationTitleAlternate PLoS Comput Biol
PublicationYear 2022
Publisher Public Library of Science
Public Library of Science (PLoS)
References
[1] The Economist. The world’s most valuable resource is no longer oil, but data. 2017.
[2] TH Davenport. Data scientist. Harv Bus Rev. 2012; 90: 70.
[3] JT Leek. Statistics: P values are just the tip of the iceberg. Nature. 2015; 520: 612. doi: 10.1038/520612a
[4] C Chatfield. The Initial Examination of Data. J R Stat Soc Ser A. 1985: 214. doi: 10.2307/2981969
[5] JA Nelder. Statistics, Science and Technology. J R Stat Soc Ser A. 1986; 149: 109. doi: 10.2307/2981525
[6] DG Altman. The scandal of poor medical research. BMJ. 1994; 308: 283. doi: 10.1136/bmj.308.6924.283
[7] M Huebner. A contemporary conceptual framework for initial data analysis. Obs Stud. 2018; 4: 171. doi: 10.1353/obs.2018.0014
[8] JM Wicherts. Degrees of Freedom in Planning, Running, Analyzing, and Reporting Psychological Studies: A Checklist to Avoid p-Hacking. Front Psychol. 2016; 7: 1832. doi: 10.3389/fpsyg.2016.01832
[9] I Yanai. A hypothesis is a liability. Genome Biol. 2020; 21: 1. doi: 10.1186/s13059-020-02133-w
[10] C Chatfield. Avoiding Statistical Pitfalls. Statist Sci. 1991; 6(3): 240.
[11] pcbi.1009819.ref011
[12] D Cook. The foundation is available for thinking about data visualization inferentially. Harv Data Sci Rev. 2021.
[13] M Huebner; Topic Group “Initial Data Analysis” of the STRATOS Initiative (STRengthening Analytical Thinking for Observational Studies, http://www.stratos-initiative.org). Hidden analyses: a review of reporting practice and recommendations for more transparent reporting of initial data analyses. BMC Med Res Methodol. 2020; 20: 61. doi: 10.1186/s12874-020-00942-y
[14] RE Kass. Ten Simple Rules for Effective Statistical Practice. PLoS Comput Biol. 2016; 12: e1004961. doi: 10.1371/journal.pcbi.1004961
[15] G Wilson. Good enough practices in scientific computing. PLoS Comput Biol. 2017; 13: e1005510. doi: 10.1371/journal.pcbi.1005510
[16] GK Sandve. Ten simple rules for reproducible computational research. PLoS Comput Biol. 2013; 9: e1003285. doi: 10.1371/journal.pcbi.1003285
[17] A Richter. Data quality monitoring in clinical and observational epidemiologic studies: the role of metadata and process information [Management von Datenqualität in klinischen und beobachtenden epidemiologischen Studien: Die Rolle von Metadaten und Prozessinformationen].
[18] CO Schmidt. Facilitating harmonized data quality assessments. A data quality framework for observational health research data collections with software implementations in R. BMC Med Res Methodol. 2021; 21: 63. doi: 10.1186/s12874-021-01252-7
[19] MD Wilkinson. The FAIR Guiding Principles for scientific data management and stewardship. Sci Data. 2016; 3: 160018. doi: 10.1038/sdata.2016.18
[20] C Mallows. The Zeroth Problem. Am Stat. 1998; 52: 1. doi: 10.1080/00031305.1998.10480528
[21] NL Kerr. HARKing: hypothesizing after the results are known. Personal Soc Psychol Rev. 1998; 2: 196. doi: 10.1207/s15327957pspr0203_4
[22] B Shneiderman. The eyes have it: a task by data type taxonomy for information visualizations. Proceedings 1996 IEEE Symposium on Visual Languages. 1996: 336.
[23] M Vandemeulebroecke. Effective Visual Communication for the Quantitative Scientist. CPT Pharmacometrics Syst Pharmacol. 2019; 8: 705. doi: 10.1002/psp4.12455
[24] KJ Lee. Framework for the treatment and reporting of missing data in observational studies: The Treatment And Reporting of Missing data in Observational Studies framework. J Clin Epidemiol. 2021; 134: 79. doi: 10.1016/j.jclinepi.2021.01.008
[25] RJA Little. Statistical Analysis with Missing Data. 2019.
[26] I Simera. Transparent and accurate reporting increases reliability, utility, and impact of your research: reporting guidelines and the EQUATOR Network. BMC Med. 2010; 8: 24. doi: 10.1186/1741-7015-8-24
SecondaryResourceType review_article
StartPage e1009819
SubjectTerms Computer and Information Sciences
Data Analysis
Data management
Data mining
Decision analysis
Humans
Hypotheses
Information management
Laws, regulations and rules
Metadata
Methods
Ovarian Neoplasms
Physical Sciences
Planning
Reproducibility
Research and Analysis Methods
Researchers
Science Policy
Social Sciences
Statistical analysis
Subject specialists
Title Ten simple rules for initial data analysis
URI https://www.ncbi.nlm.nih.gov/pubmed/35202399
https://www.proquest.com/docview/2640120254
https://www.proquest.com/docview/2633848629
https://pubmed.ncbi.nlm.nih.gov/PMC8870512
https://doaj.org/article/1ec60f7014d942ab816bba7b4ccf7049
http://dx.doi.org/10.1371/journal.pcbi.1009819
Volume 18