Who are the 100 largest scientific publishers by journal count? A webscraping approach

Bibliographic Details
Published in Journal of Documentation, Vol. 78, no. 7, pp. 450-463
Main Author Nishikawa-Pacher, Andreas
Format Journal Article
Language English
Published Bradford Emerald Publishing Limited 19.12.2022
Emerald Group Publishing Limited
ISSN 0022-0418
1758-7379
DOI 10.1108/JD-04-2022-0083

Summary:
Purpose: How can one obtain a list of the 100 largest scientific publishers sorted by journal count? Existing databases are unhelpful because each of them has inherent biased omissions and data quality flaws. This paper tries to fill this gap with an alternative approach.
Design/methodology/approach: The content coverages of Scopus, Publons, DOAJ and SherpaRomeo were first used to extract a preliminary list of publishers that supposedly possess at least 15 journals. Second, the publishers' websites were scraped to fetch their portfolios and, thus, their “true” journal counts.
Findings: The outcome is a list of the 100 largest publishers comprising 28,060 scholarly journals, with the largest publishing 3,763 journals and the smallest carrying 76 titles. The usual “oligopoly” of major publishing companies leads the list, but it also contains 17 university presses from the Global South and, surprisingly, 30 predatory publishers that together publish 4,517 journals.
Research limitations/implications: Additional data sources could be used to mitigate remaining biases; it is difficult to disambiguate publisher names and their imprints; and the dataset carries a non-uniform distribution, thus risking the omission of data points in the lower range.
Practical implications: The dataset can serve as a useful basis for comprehensive meta-scientific surveys at the publisher level.
Originality/value: The catalogue can be deemed more inclusive and diverse than other ones because many of the publishers would have been overlooked if one had drawn from merely one or two sources. The list is freely accessible and invites regular updates. The approach used here (webscraping) has seldom been used in meta-scientific surveys.
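The two-step procedure described in the summary can be illustrated with a minimal Python sketch. It is not the author's code. It assumes that a publisher enters the preliminary list when at least one source reports 15 or more journals for it (the abstract does not state how the four sources were combined), and it assumes a hypothetical page layout in which each journal appears as an `<a class="journal">` link. The function names, the class name and the markup are all illustrative; real publisher websites differ and would each need their own extraction rule.

```python
from html.parser import HTMLParser

MIN_JOURNALS = 15  # threshold named in the abstract


def preliminary_candidates(sources, threshold=MIN_JOURNALS):
    """Step 1: list publishers for which any source reports >= threshold journals.

    `sources` maps a source name to a {publisher: journal_count} dict.
    Taking the maximum across sources is an assumption of this sketch.
    """
    best = {}
    for counts in sources.values():
        for publisher, n in counts.items():
            best[publisher] = max(best.get(publisher, 0), n)
    return sorted(p for p, n in best.items() if n >= threshold)


class JournalLinkCounter(HTMLParser):
    """Collect distinct hrefs of <a class="journal"> links (hypothetical markup)."""

    def __init__(self):
        super().__init__()
        self.links = set()

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        href = attributes.get("href")
        if tag == "a" and attributes.get("class") == "journal" and href:
            self.links.add(href)


def count_journals(html):
    """Step 2: count the distinct journal links on an already downloaded page."""
    parser = JournalLinkCounter()
    parser.feed(html)
    return len(parser.links)


def rank_publishers(scraped_counts, top_n=100):
    """Sort publishers by scraped journal count (descending), ties by name."""
    ordered = sorted(scraped_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ordered[:top_n]
```

In use, `preliminary_candidates` would be fed the publisher counts exported from each database, `count_journals` would be applied to each candidate's portfolio page, and `rank_publishers` would produce the final top-100 list from the scraped counts. Counting distinct links rather than all links guards against a journal being listed twice on the same page.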