Separating Biological Variance from Noise by Applying Expectation–Maximization Algorithm to Modified General Linear Model

The general linear model (GLM) has been widely used in research, where the error term has been treated as noise. However, compelling evidence suggests that in biological systems, the target variables may possess their innate variances. A modified GLM was proposed to explicitly model biological varia...

Bibliographic Details
Published in Journal of computational biology
Main Author Lee, Tien-Wen
Format Journal Article
Language English
Published United States Mary Ann Liebert, Inc., publishers 05.09.2025
ISSN 1557-8666
DOI 10.1177/15578666251370766

Abstract The general linear model (GLM) is widely used in research, and its error term is conventionally treated as noise. However, compelling evidence suggests that in biological systems the target variables may possess innate variance of their own. A modified GLM was proposed to explicitly model both biological variance and nonbiological noise. An expectation–maximization (EM) scheme, termed EMSEV (EM for separating variances), was used to distinguish biological variance from noise. The performance of EMSEV was evaluated by varying noise levels, dimensions of the design matrix, and covariance structures of the target variables. The deviation between EMSEV outputs and the predefined distribution parameters increased with noise level. With a proper initial guess, when the noise magnitude and the variance of the target variables were similar, the estimated mean and covariance of the target variables deviated by 3% and 10%–16%, respectively, and the noise estimate deviated by 1.7%. EMSEV appears promising for distinguishing signal variance from noise in biological systems. The potential applications and implications in biological science and statistical inference are discussed.
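This record does not reproduce the model equations, so the following is only an illustrative sketch of the general idea, not the author's EMSEV implementation. It assumes a random-coefficient form of the GLM, y_i = X b_i + e_i, where the target variables b_i ~ N(mu, Sigma) carry the biological variance and e_i ~ N(0, sigma2 I) is noise. The function names `em_separate_variances` and `simulate`, the least-squares initial guess, and the fixed iteration count are all assumptions made for this example.

```python
import numpy as np

def simulate(N, X, mu, Sigma, sigma2, rng):
    """Draw N units from the assumed model y_i = X b_i + e_i."""
    B = rng.multivariate_normal(mu, Sigma, size=N)            # (N, p) target variables
    noise = rng.normal(0.0, np.sqrt(sigma2), size=(N, X.shape[0]))
    return B @ X.T + noise                                    # (N, n) observations

def em_separate_variances(X, Y, n_iter=200):
    """EM estimates of (mu, Sigma, sigma2) under the assumed model.

    X: (n, p) design matrix shared by all units; Y: (N, n), one row per unit.
    """
    n, p = X.shape
    N = Y.shape[0]
    # Initial guess from per-unit ordinary least squares (an assumption).
    B0 = np.linalg.lstsq(X, Y.T, rcond=None)[0].T             # (N, p)
    mu = B0.mean(axis=0)
    Sigma = np.cov(B0, rowvar=False).reshape(p, p) + 1e-6 * np.eye(p)
    sigma2 = max((Y - B0 @ X.T).var(), 1e-6)
    XtX = X.T @ X
    for _ in range(n_iter):
        # E-step: Gaussian posterior of each b_i given y_i.
        Sigma_inv = np.linalg.inv(Sigma)
        V = np.linalg.inv(Sigma_inv + XtX / sigma2)           # shared posterior covariance
        M = (Sigma_inv @ mu + (Y @ X) / sigma2) @ V           # (N, p) posterior means
        # M-step: closed-form updates of the three parameters.
        mu = M.mean(axis=0)
        D = M - mu
        Sigma = D.T @ D / N + V
        R = Y - M @ X.T
        sigma2 = (np.sum(R ** 2) + N * np.trace(XtX @ V)) / (N * n)
    return mu, Sigma, sigma2
```

In this sketch the two variance sources can only be told apart when each unit has more observations than the design matrix has columns (n > p), and the estimates can be expected to degrade as sigma2 grows relative to Sigma, which is in line with the trend the abstract reports.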
Author Lee, Tien-Wen
Author_xml – sequence: 1
  givenname: Tien-Wen
  orcidid: 0000-0001-5707-026X
  surname: Lee
  fullname: Lee, Tien-Wen
BackLink https://www.ncbi.nlm.nih.gov/pubmed/40911519 (View this record in MEDLINE/PubMed)
ContentType Journal Article
Copyright 2025, Mary Ann Liebert, Inc., publishers
Discipline Biology
Mathematics
EISSN 1557-8666
ExternalDocumentID 40911519
10_1177_15578666251370766
Genre Journal Article
IsPeerReviewed true
IsScholarly true
Keywords expectation–maximization algorithm
design matrix
global optimum
general linear model
local optimum
Language English
License https://www.liebertpub.com/nv/resources-tools/text-and-data-mining-policy/121
ORCID 0000-0001-5707-026X
PMID 40911519
PublicationDate 2025-09-05
PublicationPlace United States
PublicationTitle Journal of computational biology
PublicationTitleAlternate J Comput Biol
PublicationYear 2025
Publisher Mary Ann Liebert, Inc., publishers
Title Separating Biological Variance from Noise by Applying Expectation–Maximization Algorithm to Modified General Linear Model
URI https://www.liebertpub.com/doi/abs/10.1177/15578666251370766
https://www.ncbi.nlm.nih.gov/pubmed/40911519
https://www.proquest.com/docview/3247475996