Variational Bayes estimation of hierarchical Dirichlet-multinomial mixtures for text clustering

Bibliographic Details
Published in Computational Statistics, Vol. 38, no. 4, pp. 2015-2051
Main Authors Bilancia, Massimo; Di Nanni, Michele; Manca, Fabio; Pio, Gianvito
Format Journal Article
Language English
Published Berlin/Heidelberg, Springer Berlin Heidelberg, 01.12.2023
Springer Nature B.V.
ISSN 0943-4062
1613-9658
DOI 10.1007/s00180-023-01350-8

Abstract In this paper, we formulate a hierarchical Bayesian version of the Mixture of Unigrams model for text clustering and approach its posterior inference through variational inference. We compute the explicit expression of the variational objective function for our hierarchical model under a mean-field approximation. We then derive the update equations of a suitable algorithm based on coordinate ascent to find local maxima of the variational target, and estimate the model parameters through the optimized variational hyperparameters. The advantages of variational algorithms over traditional Markov Chain Monte Carlo methods based on iterative posterior sampling are also discussed in detail.
Author details
Bilancia, Massimo (ORCID 0000-0002-5330-2403; email massimo.bilancia@uniba.it): Department of Precision and Regenerative Medicine and Ionian Area (DiMePRe-J), University of Bari Aldo Moro, Policlinic University Hospital
Di Nanni, Michele: EY Business and Technology Solution
Manca, Fabio: Department of Education, Psychology, Communication (ForPsiCom), University of Bari Aldo Moro, Palazzo Chiaia - Napolitano
Pio, Gianvito: Department of Computer Science, University of Bari Aldo Moro
Copyright The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2023. Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
Discipline Statistics
Mathematics
EISSN 1613-9658
Open Access Yes (DOI open access)
Peer Reviewed Yes
Scholarly Yes
Keywords Text clustering
Finite mixture models
Variational inference
Dirichlet-multinomial distribution
Bayesian hierarchical modelling
License cc-by-nc-nd
OpenAccessLink https://proxy.k.utb.cz/login?url=https://hdl.handle.net/11586/429445
PageCount 37
– name: Springer Nature B.V
References Zhang C, Kjellström H (2015) How to supervise topic models. In: Agapito L, Bronstein MM, Rother C (eds) Computer vision—ECCv 2014 workshops. Springer, Cham, pp 500–515. https://doi.org/10.1007/978-3-319-16181-5_39
R Core Team (2022) R: a language and environment for statistical computing. https://www.R-project.org
Nielsen F, Garcia V (2009) Statistical exponential families: a digest with flash cards. arXiv:0911.4863
TitteringtonDMWangBConvergence properties of a general algorithm for calculating variational Bayesian estimates for a Normal mixture modelBayesian Anal2006222129110.1214/06-BA1211331.62168
ManningCDRaghavanPSchützeHIntroduction to information retrieval2008CambridgeCambridge University Press10.1017/CBO97805118090711160.68008
CeleuxGFrüwirth-SchnatterSRobertCPFrühwirth-SchnatterSCeleuxGRobertCPModel selection for mixture models—perspectives and strategiesHandbook of mixture analysis2018New YorkChapmann & Hall11815410.1201/9780429055911
AnderlucciLViroliCMixtures of Dirichlet-multinomial distributions for supervised and unsupervised classification of short text dataAdv Data Anal Classif202014759770420157210.1007/s11634-020-00399-31459.62102
CeleuxGHurnMRobertCPComputational and inferential difficulties with mixture posterior distributionsJ Am Stat Assoc200095957970180445010.1080/01621459.2000.104742850999.62020
PolliceABilanciaMA hierarchical finite mixture model for Bayesian classification in the presence of auxiliary informationMetron Int J Stat2000LVIII10913118542250999.62504
Greene D, Cunningham P (2006) Practical solutions to the problem of diagonal dominance in kernel document clustering. In: Proceedings of the 23rd international conference on machine learning (ICML’06). ACM Press, pp 377–384
AggarwalCCZhaiCMining text data2012New YorkSpringer10.1007/978-1-4614-3223-4
BlanchardPHighamDJHighamNJAccurately computing the log-sum-exp and softmax functionsIMA J Numer Anal20214123112330432838510.1093/imanum/draa0381509.65019
HastieTTibshiraniRFriedmanJThe elements of statistical learning20092New YorkSpringer10.1007/978-0-387-84858-71273.62005
KeribinCConsistent estimation of the order of mixture modelsSankhyā Indian J Stat Ser A (1961–2002)200062496617697351081.62516
XuDTianYA comprehensive survey of clustering algorithmsAnn Data Sci2015216519310.1007/s40745-015-0040-1
MarinJMRobertCApproximating the marginal likelihood in mixture modelsIndian Bayesian Soc Newslett2008527
van der MaatenLHintonGVisualizing data using t-SNEJ Mach Learn Res20089257926051225.68219
AnastasiuDCTagarelliAKarypisGAggarwalCCReddyCKDocument clustering: the next frontierData clustering: algorithms and applications2014Boca RatonChapman & Hall305338
BaudryJPMaugisCMichelBSlope heuristics: overview and implementationStat Comput201222455470286502910.1007/s11222-011-9236-11322.62007
SankaranKHolmesSPLatent variable modeling for the microbiomeBiostatistics201920599614401972010.1093/biostatistics/kxy018
Frühwirth-SchnatterSEstimating marginal likelihoods for mixture and Markov switching models using bridge sampling techniquesEconom J20047143167207663010.1111/j.1368-423X.2004.00125.x1053.62087
Malsiner-WalliGFrühwirth-SchnatterSGrünBModel-based clustering based on sparse finite Gaussian mixturesStat Comput201626303324343937510.1007/s11222-014-9500-21342.62109
ZhangCButepageJKjellstromHAdvances in variational inferenceIEEE Trans Pattern Anal Mach Intell2019412008202610.1109/TPAMI.2018.2889774
DaytonCMMacreadyGBConcomitant-variable latent-class modelsJ Am Stat Assoc19888317394101410.2307/2288938
JordanMIGhahramaniZJaakkolaTSAn introduction to variational methods for graphical modelsMach Learn19993718323310.1023/A:10076659071780945.68164
Silverman J (2022) RcppHungarian: solves minimum cost bipartite matching problems. https://CRAN.R-project.org/package=RcppHungarian, R package version 0.2
FeinererIHornikKMeyerDText mining infrastructure in RJ Stati Softw200810.18637/jss.v025.i05
Kaggle (2022) Sports dataset(bbc). https://www.kaggle.com/datasets/maneesh99/sports-datasetbbc. Accessed 04 Nov 2022
BleiDMKucukelbirAMcAuliffeJDVariational inference: a review for statisticiansJ Am Stat Assoc2017112859877367177610.1080/01621459.2017.1285773
Frühwirth-SchnatterSFinite mixture and Markov switching models2006New YorkSpringer10.1007/978-0-387-35768-31108.62002
KunkelDPeruggiaMAnchored Bayesian Gaussian mixture modelsElectron J Stat2020416549610.1214/20-EJS17561452.62453
LiHFanXA pivotal allocation-based algorithm for solving the label-switching problem in Bayesian mixture modelsJ Comput Graph Stat201625266283347404710.1080/10618600.2014.983643
Andrews N, Fox E (2007) Recent developments in document clustering. http://hdl.handle.net/10919/19473, Virginia Tech computer science technical report, TR-07-35
StephensMDealing with label switching in mixture modelsJ R Stat Soc Ser B (Stat Methodol)200062795809179629310.1111/1467-9868.002650957.62020
CeleuxGKamaryKMalsiner-WalliGFrühwirth-SchnatterSCeleuxGRobertCPComputational solutions for Bayesian inference in mixture modelsHandbook of mixture analysis2018New YorkChapmann & Hall7311510.1201/9780429055911
GelmanACarlinJSternHBayesian data analysis20133Boca RatonChapman and Hall10.1201/b160180914.62018
Maechler M (2022) Rmpfr: R mpfr—multiple precision floating-point reliable. https://cran.r-project.org/package=Rmpfr, R package version 0.8-9
Feinerer I, Hornik K (2020) tm: text mining package. https://CRAN.R-project.org/package=tm, R package version 0.7-8
RakibMRHZehNJankowskaMMétaisEMezianeFHoracekHEnhancement of short text clustering by iterative classificationNatural language processing and information systems2020BerlinSpringer10511710.1007/978-3-030-51310-8_10
WallachHMimnoDMcCallumABengioYSchuurmansDLaffertyJRethinking LDA: why priors matterAdvances in neural information processing systems2009New YorkCurran Associates Inc.
RobertCPThe Bayesian choice2007New YorkSpringer10.1007/0-387-71599-1
WainwrightMJJordanMIGraphical models, exponential families, and variational inferenceFound Trends® Mach Learn20071130510.1561/22000000011193.62107
HornikKFeinererIKoberMSpherical k\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$k$$\end{document}-means clusteringJ Stat Softw201210.18637/jss.v050.i10
Tran MN, Nguyen TN, Dao VH (2021) A practical tutorial on variational Bayes. arXiv:2103.01327
BaudryJPCeleuxGEM for mixtures. Inizialiation requires special careStat Comput201525713726336048710.1007/s11222-015-9561-x1331.62301
BleiDMNgAYJordanMILatent Dirichlet allocationJ Mach Learn Res2003399310221112.68379
Chandra NK, Canale A, Dunson DB (2020) Escaping the curse of dimensionality in Bayesian model based clustering. arxiv:2006.02700
HarrisZSDistributional structureWORD19541014616210.1080/00437956.1954.11659520
BleiDMLaffertyJDA correlated topic model of scienceAnn Appl Stat2007239383910.1214/07-AOAS1141129.62122
BleiDMProbabilistic topic modelsCommun ACM201255778410.1145/2133806.2133826
MosimannJEOn the compound multinomial distribution, the multivariate Beta-distribution, and correlations among proportionsBiometrika196249658214329910.1093/biomet/49.1-2.650105.12502
DieboltJRobertCPEstimation of finite mixture distributions through Bayesian samplingJ R Stat Soc Ser B (Methodol)199456363375128194010.1111/j.2517-6161.1994.tb01985.x0796.62028
LeeSYGibbs sampler and coordinate ascent variational inference: a set-theoretical reviewCommun Stat Theory Methods202110.1080/03610926.2021.1921214
PlummerSPatiDBhattacharyaADynamics of coordinate ascent variational inference: a case study in 2D Ising modelsEntropy2020221263422206610.3390/e22111263
GhahramaniZProbabilistic machine learning and artificial intelligenceNature201552145245910.1038/nature14541
NigamKMccallumAKThrunSText classification from labeled and unlabeled documents using EMMach Learn20003910313410.1023/A:10076927130850949.68162
AiroldiEMBleiDEroshevaEAHandbook of mixed membership models and their applications2014Boca RatonChapman and Hall10.1201/b175201369.62003
MurphyKPMachine learning: a probabilistic perspective2012CambridgeThe MIT Press1295.68003
AptéCDamerauFWeissSMAutomated learning of decision rules for text categorizationACM Trans Inf Syst19941223325110.1145/183422.183423
AwasthiPRisteskiACortesCLawrenceNLeeDOn some provably correct cases of variational inference for topic modelsAdvances in neural information processing systems2015New YorkCurran Associates, Inc.
DhillonISModhaDSConcept decompositions for large sparse text data using clusteringMach Learn20014214317510.1023/A:10076129209710970.68167
Nikita M (2020) ldatuning: tuning of the latent Dirichlet allocation models parameters. https://CRAN.R-project.org/package=ldatuning, R package version 1.0.2
– reference: Greene D, Cunningham P (2006) Practical solutions to the problem of diagonal dominance in kernel document clustering. In: Proceedings of the 23rd international conference on machine learning (ICML’06). ACM Press, pp 377–384
– reference: BleiDMNgAYJordanMILatent Dirichlet allocationJ Mach Learn Res2003399310221112.68379
– reference: Frühwirth-SchnatterSEstimating marginal likelihoods for mixture and Markov switching models using bridge sampling techniquesEconom J20047143167207663010.1111/j.1368-423X.2004.00125.x1053.62087
– reference: MurphyKPMachine learning: a probabilistic perspective2012CambridgeThe MIT Press1295.68003
Start page 2015
Subject terms
Algorithms
Approximation
Clustering
Dirichlet problem
Economic Theory/Quantitative Economics/Mathematical Methods
Inference
Iterative methods
Markov analysis
Markov chains
Mathematics and Statistics
Maxima
Mixtures
Monte Carlo simulation
Original Paper
Probability and Statistics in Computer Science
Probability distribution
Probability Theory and Stochastic Processes
Statistics
Title Variational Bayes estimation of hierarchical Dirichlet-multinomial mixtures for text clustering
URI https://link.springer.com/article/10.1007/s00180-023-01350-8
https://www.proquest.com/docview/2889791391
https://hdl.handle.net/11586/429445
Volume 38