Predicting the Political Sentiment of Web Log Posts Using Supervised Machine Learning Techniques Coupled with Feature Selection



Bibliographic Details
Published in	Advances in Web Mining and Web Usage Analysis, Vol. 4811, pp. 187-206
Main Authors	Durant, Kathleen T., Smith, Michael D.
Format	Book Chapter
Language	English
Published	Germany: Springer Berlin Heidelberg, 2007
Series	Lecture Notes in Computer Science
Subjects	Artificial intelligence; Blogs; Feature selection; Naïve Bayes; Network hardware; Sentiment classification; Support Vector Machines; Web logs; WEKA
ISBN	354077484X 9783540774846
ISSN	0302-9743 1611-3349
DOI	10.1007/978-3-540-77485-3_11

Summary:	As the number of web logs dramatically grows, readers are turning to them as an important source of information. Automatic techniques that identify the political sentiment of web log posts will help bloggers categorize and filter this exploding information source. In this paper we illustrate the effectiveness of supervised learning for sentiment classification on web log posts. We show that a Naïve Bayes classifier coupled with a forward feature selection technique can on average correctly predict a posting’s sentiment 89.77% of the time with a standard deviation of 3.01. It significantly outperforms Support Vector Machines at the 95% confidence level with a confidence interval of [1.5, 2.7]. The feature selection technique provides on average an 11.84% and a 12.18% increase for Naïve Bayes and Support Vector Machines results respectively. Previous sentiment classification research achieved an 81% accuracy using Naïve Bayes and 82.9% using SVMs on a movie domain corpus.
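
The combination named in the summary, a Naïve Bayes classifier paired with forward feature selection, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the subject keywords suggest they worked in WEKA, and this record does not say which Naïve Bayes variant or selection criterion they used. The sketch assumes a Bernoulli (word present or absent) model with add-one smoothing and a greedy wrapper that adds one word at a time while accuracy on a held-out set improves. All function names and the four toy posts are hypothetical.

```python
import math


def train_nb(docs, labels, vocab):
    """Fit a Bernoulli Naive Bayes model restricted to the words in `vocab`."""
    model = {}
    for c in sorted(set(labels)):
        class_docs = [set(d) for d, y in zip(docs, labels) if y == c]
        n = len(class_docs)
        prior = math.log(n / len(docs))
        # Add-one smoothing keeps every probability strictly inside (0, 1).
        p = {w: (sum(w in d for d in class_docs) + 1) / (n + 2) for w in vocab}
        model[c] = (prior, p)
    return model


def predict_nb(model, doc):
    """Return the class with the highest log-posterior for one tokenized post."""
    words = set(doc)

    def score(c):
        prior, p = model[c]
        return prior + sum(
            math.log(p[w] if w in words else 1.0 - p[w]) for w in p
        )

    # Iterating classes in sorted order makes tie-breaking deterministic.
    return max(sorted(model), key=score)


def accuracy(model, docs, labels):
    hits = sum(predict_nb(model, d) == y for d, y in zip(docs, labels))
    return hits / len(labels)


def forward_select(train, train_y, val, val_y, candidates):
    """Greedy forward selection: add the single word that most improves
    held-out accuracy, and stop when no remaining word improves it."""
    selected = set()
    best = accuracy(train_nb(train, train_y, selected), val, val_y)
    while True:
        pick = None
        for w in sorted(candidates - selected):
            acc = accuracy(train_nb(train, train_y, selected | {w}), val, val_y)
            if acc > best:
                best, pick = acc, w
        if pick is None:
            return selected, best
        selected.add(pick)


# Hypothetical toy corpus, invented for illustration only.
train = [
    ["tax", "cuts", "good"],
    ["war", "bad", "policy"],
    ["tax", "relief", "good"],
    ["war", "wrong", "bad"],
]
train_y = ["right", "left", "right", "left"]
val = [["tax", "good"], ["war", "bad"]]
val_y = ["right", "left"]

candidates = {w for doc in train for w in doc}
selected, best = forward_select(train, train_y, val, val_y, candidates)
final_model = train_nb(train, train_y, selected)
```

The wrapper retrains the classifier for every candidate word, which is affordable on a toy vocabulary but grows quickly with vocabulary size; a filter-style ranking of words would be a cheaper alternative under the same assumptions.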