High-dimensional disjoint factor analysis with its EM algorithm version

Bibliographic Details
Published in Japanese journal of statistics and data science Vol. 4; no. 1; pp. 427–448
Main Authors Cai, Jingyu; Adachi, Kohei
Format Journal Article
Language English
Published Singapore: Springer Singapore, 01.07.2021
ISSN 2520-8756, 2520-8764
DOI 10.1007/s42081-021-00119-x

Abstract Vichi (Advances in Data Analysis and Classification, 11:563–591, 2017) proposed disjoint factor analysis (DFA), a factor analysis procedure subject to the constraint that variables are mutually disjoint. That is, in the DFA solution, each variable loads on only one of the multiple factors, which implies that the variables are clustered into exclusive groups. Such variable clustering is considered useful for high-dimensional data in which variables far outnumber observations. However, the feasibility of DFA for high-dimensional data was not considered in Vichi (2017). Thus, one purpose of this paper is to show the feasibility and usefulness of DFA for high-dimensional data. Another purpose is to propose a new computational procedure for DFA that uses an EM algorithm. This procedure, called EM-DFA, serves the same purpose as the original procedure in Vichi (2017) but more efficiently. Numerical studies demonstrate that both DFA and EM-DFA can cluster variables fairly well, with EM-DFA being more computationally efficient.
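Illustration The disjointness constraint described in the abstract can be sketched in a few lines of Python. This is only a minimal illustration of the constraint itself (one nonzero loading per variable, which induces a partition of the variables); it is not the DFA or EM-DFA estimation procedure from the paper. The function names `is_disjoint` and `clusters_from_loadings` and the numeric loadings are hypothetical and chosen for this example.

```python
def is_disjoint(loadings, tol=1e-12):
    """Return True if every variable (row) has exactly one nonzero loading."""
    return all(sum(abs(v) > tol for v in row) == 1 for row in loadings)


def clusters_from_loadings(loadings, tol=1e-12):
    """Group variable indices by the single factor each variable loads on."""
    if not is_disjoint(loadings, tol):
        raise ValueError("loading matrix is not disjoint")
    n_factors = len(loadings[0])
    groups = [[] for _ in range(n_factors)]
    for var_index, row in enumerate(loadings):
        # The only nonzero entry in the row is also the largest in magnitude.
        factor_index = max(range(n_factors), key=lambda k: abs(row[k]))
        groups[factor_index].append(var_index)
    return groups


# Hypothetical loading matrix: 6 variables (rows), 2 factors (columns).
# The values are made up for illustration and do not come from the paper.
example_loadings = [
    [0.8, 0.0],
    [0.7, 0.0],
    [0.0, 0.9],
    [0.6, 0.0],
    [0.0, 0.5],
    [0.0, -0.4],
]

print(clusters_from_loadings(example_loadings))  # [[0, 1, 3], [2, 4, 5]]
```

Under this constraint, reading off the column of each row's nonzero entry is enough to recover the variable clusters, which is why a disjoint solution doubles as a variable clustering.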
Author Adachi, Kohei
Cai, Jingyu
Author_xml – sequence: 1
  givenname: Jingyu
  orcidid: 0000-0002-8726-3991
  surname: Cai
  fullname: Cai, Jingyu
  email: caijingyu10@yahoo.co.jp
  organization: Graduate School of Human Sciences, Osaka University
– sequence: 2
  givenname: Kohei
  surname: Adachi
  fullname: Adachi, Kohei
  organization: Graduate School of Human Sciences, Osaka University
Cites_doi 10.1007/s11634-017-0284-z
10.1007/s11336-017-9600-y
10.1007/s11336-012-9299-8
10.1007/BF02289162
10.1007/s11222-014-9458-0
10.1007/BF02289658
10.1016/j.csda.2008.05.028
10.1007/BF02293851
10.1007/s11634-016-0263-9
10.1007/s00180-015-0608-4
10.1002/9781119970583
10.1007/BF02294359
10.1177/001316446002000116
10.1016/j.csda.2016.01.012
10.1137/1.9780898718348
10.1093/bioinformatics/17.9.763
10.1002/wics.1458
10.1111/j.2517-6161.1977.tb01600.x
ContentType Journal Article
Copyright Japanese Federation of Statistical Science Associations 2021
Discipline Medicine
Economics
Law
Statistics
Physics
Computer Science
EISSN 2520-8764
EndPage 448
IsPeerReviewed true
IsScholarly true
Issue 1
Keywords High-dimensional data
Variable clustering
EM algorithm
Disjoint factor analysis
Language English
PageCount 22
PublicationDate 2021-07-01
PublicationPlace Singapore
PublicationTitle Japanese journal of statistics and data science
PublicationTitleAbbrev Jpn J Stat Data Sci
PublicationYear 2021
Publisher Springer Singapore
References
CR1 Adachi (2013). Factor analysis with EM algorithm never gives improper solutions when sample covariance and initial parameter matrices are proper. Psychometrika, 78, 380–394. doi:10.1007/s11336-012-9299-8
CR2 Adachi, Sakata (2016). Three-way principal component analysis with its applications to psychology. Applied matrix and tensor variate data analysis, pp. 1–21.
CR3 doi:10.1002/wics.1458
CR4 Adachi, Trendafilov (2016). Sparse principal component analysis subject to prespecified cardinality of loadings. Computational Statistics, 31, 1403–1427. doi:10.1007/s00180-015-0608-4
CR5 Adachi, Trendafilov (2018). Sparsest factor analysis for clustering variables: A matrix decomposition approach. Advances in Data Analysis and Classification, 12, 559–585. doi:10.1007/s11634-017-0284-z
CR6 Adachi, Trendafilov (2018). Some mathematical properties of the matrix decomposition solution in factor analysis. Psychometrika, 83, 407–424. doi:10.1007/s11336-017-9600-y
CR7 Akaike (1987). Factor analysis and AIC. Psychometrika, 52, 317–332. doi:10.1007/BF02294359
CR8 Bartholomew, Knott, Moustaki (2011). Latent variable models and factor analysis: A unified approach (Third Edition). doi:10.1002/9781119970583
CR9 Dempster, Laird, Rubin (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society: Series B, 39, 1–38. doi:10.1111/j.2517-6161.1977.tb01600.x
CR10 Gan, Ma, Wu (2007). Data clustering: Theory, algorithms, and applications. doi:10.1137/1.9780898718348
CR11 Guttman (1954). Some necessary conditions for common-factor analysis. Psychometrika, 19, 149–160. doi:10.1007/BF02289162
CR12 Hirose, Yamamoto (2015). Sparse estimation via nonconcave penalized likelihood in factor analysis model. Statistics and Computing, 25, 863–875. doi:10.1007/s11222-014-9458-0
CR13 Jöreskog (1967). Some contributions to maximum likelihood factor analysis. Psychometrika, 32, 443–482. doi:10.1007/BF02289658
CR14 Kaiser (1960). The application of electronic computers to factor analysis. Educational and Psychological Measurements, 20, 141–151. doi:10.1177/001316446002000116
CR15 Koch (2014). Analysis of multivariate and high-dimensional data.
CR16 Konishi, Kitagawa (2007). Information criteria and statistical modeling.
CR17 Osgood, Suci, Tannenbaum (1957). The measurement of meaning.
CR18 Rubin, Thayer (1982). EM algorithms for ML factor analysis. Psychometrika, 47, 69–76. doi:10.1007/BF02293851
CR19 Seber (2008). A matrix handbook for statisticians.
CR20 Stegeman (2016). A new method for simultaneous estimation of the factor model parameters, factor scores, and unique parts. Computational Statistics & Data Analysis, 99, 189–203. doi:10.1016/j.csda.2016.01.012
CR21 Vichi (2017). Disjoint factor analysis with cross-loadings. Advances in Data Analysis and Classification, 11, 563–591. doi:10.1007/s11634-016-0263-9
CR22 Vichi, Saporta (2009). Clustering and disjoint principal component analysis with cross-loadings. Computational Statistics & Data Analysis, 53, 3194–3208. doi:10.1016/j.csda.2008.05.028
CR23 Yanai, Ichikawa, Rao, Sinharay (2007). Factor analysis. Handbook of statistics, vol. 26: Psychometrics, pp. 257–296.
CR24 Yeung, Ruzzo (2001). Principal component analysis for clustering gene expression data. Bioinformatics, 17, 763–774. doi:10.1093/bioinformatics/17.9.763
StartPage 427
SubjectTerms Chemistry and Earth Sciences
Computer Science
Economics
Finance
Health Sciences
Humanities
Insurance
Law
Management
Mathematics and Statistics
Medicine
Original Paper
Physics
Statistical Theory and Methods
Statistics
Statistics and Computing/Statistics Programs
Statistics for Business
Statistics for Engineering
Statistics for Life Sciences
Statistics for Social Sciences
Title High-dimensional disjoint factor analysis with its EM algorithm version
URI https://link.springer.com/article/10.1007/s42081-021-00119-x
Volume 4