Detecting nuance in conspiracy discourse: Advancing methods in infodemiology and communication science with machine learning and qualitative content coding

Bibliographic Details
Published in PLoS ONE Vol. 18; no. 12; p. e0295414
Main Authors Haupt, Michael Robert; Chiu, Michelle; Chang, Joseline; Li, Zoe; Cuomo, Raphael; Mackey, Tim K.
Format Journal Article
Language English
Published United States Public Library of Science 20.12.2023
ISSN 1932-6203
DOI 10.1371/journal.pone.0295414

Summary: The spread of misinformation and conspiracies has been an ongoing issue since the early stages of the internet era, resulting in the emergence of the field of infodemiology (i.e., information epidemiology), which investigates the transmission of health-related information. Due to the high volume of online misinformation in recent years, there is a need to continue advancing methodologies in order to effectively identify narratives and themes. While machine learning models can be used to detect misinformation and conspiracies, these models are limited in their generalizability to other datasets and misinformation phenomena, and are often unable to detect implicit meanings in text that require contextual knowledge. To rapidly detect evolving conspiracist narratives within high volume online discourse while identifying nuanced themes requiring the comprehension of subtext, this study describes a hybrid methodology that combines natural language processing (i.e., topic modeling and sentiment analysis) with qualitative content coding approaches to characterize conspiracy discourse related to 5G wireless technology and COVID-19 on Twitter (currently known as ‘X’). Discourse that focused on correcting 5G conspiracies was also analyzed for comparison. Sentiment analysis shows that conspiracy-related discourse was more likely to use language that was analytic, combative, and past-oriented, to reference social status, and to express negative emotions. Corrections discourse was more likely to use words reflecting cognitive processes, prosocial relations, health-related consequences, and future-oriented language. Inductive coding characterized conspiracist narratives related to global elites, anti-vax sentiment, medical authorities, religious figures, and false correlations between technology advancements and disease outbreaks. Further, the corrections discourse did not address many of the narratives prevalent in conspiracy conversations.
This paper aims to further bridge the gap between computational and qualitative methodologies by demonstrating how both approaches can be used in tandem to emphasize the positive aspects of each methodology while minimizing their respective drawbacks.
Competing Interests: Authors TKM and ZL are employees of the startup company S-3 Research LLC. S-3 Research is a startup funded and currently supported by the National Institutes of Health – National Institute on Drug Abuse through a Small Business Innovation Research contract for opioid-related social media research and technology commercialization. TKM is also the CEO and a member of S-3 Research LLC with ownership. The authors report no other conflict of interest associated with this manuscript. This does not alter our adherence to PLOS ONE policies on sharing data and materials.