Separating Biological Variance from Noise by Applying Expectation–Maximization Algorithm to Modified General Linear Model
The general linear model (GLM) has been widely used in research, where the error term has been treated as noise. However, compelling evidence suggests that in biological systems, the target variables may possess their innate variances. A modified GLM was proposed to explicitly model biological varia...
| Published in | Journal of computational biology |
|---|---|
| Main Author | Lee, Tien-Wen |
| Format | Journal Article |
| Language | English |
| Published | United States: Mary Ann Liebert, Inc., publishers, 05.09.2025 |
| Subjects | |
| ISSN | 1557-8666 |
| DOI | 10.1177/15578666251370766 |
| Abstract | The general linear model (GLM) is widely used in research, with its error term conventionally treated as noise. However, compelling evidence suggests that in biological systems the target variables may possess innate variance of their own. A modified GLM was proposed to model biological variance and nonbiological noise explicitly. An expectation–maximization (EM) scheme, termed EMSEV (EM for separating variances), was applied to this model to distinguish biological variance from noise. The performance of EMSEV was evaluated by varying the noise level, the dimensions of the design matrix, and the covariance structure of the target variables. The deviation between EMSEV outputs and the predefined distribution parameters increased with noise level. With a proper initial guess, and when the noise magnitude was similar to the variance of the target variables, the estimated mean and covariance of the target variables deviated by 3% and 10%–16%, respectively, and the noise estimate deviated by 1.7%. EMSEV appears promising for distinguishing signal variance from noise in biological systems. The potential applications and implications for biological science and statistical inference are discussed. |
|---|---|
| Author | Lee, Tien-Wen |
| BackLink | https://www.ncbi.nlm.nih.gov/pubmed/40911519 |
| ContentType | Journal Article |
| Copyright | 2025, Mary Ann Liebert, Inc., publishers |
| DatabaseName | CrossRef PubMed MEDLINE - Academic |
| Discipline | Biology Mathematics |
| EISSN | 1557-8666 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Keywords | expectation–maximization algorithm; design matrix; global optimum; general linear model; local optimum |
| License | https://www.liebertpub.com/nv/resources-tools/text-and-data-mining-policy/121 |
| ORCID | 0000-0001-5707-026X |
| PMID | 40911519 |
| PublicationDate | 2025-09-05 |
| PublicationPlace | United States |
| PublicationTitle | Journal of computational biology |
| PublicationTitleAlternate | J Comput Biol |
| PublicationYear | 2025 |
| Publisher | Mary Ann Liebert, Inc., publishers |
| Title | Separating Biological Variance from Noise by Applying Expectation–Maximization Algorithm to Modified General Linear Model |
| URI | https://www.liebertpub.com/doi/abs/10.1177/15578666251370766 https://www.ncbi.nlm.nih.gov/pubmed/40911519 https://www.proquest.com/docview/3247475996 |