Predicting the Political Sentiment of Web Log Posts Using Supervised Machine Learning Techniques Coupled with Feature Selection

Bibliographic Details
Published in: Advances in Web Mining and Web Usage Analysis, Vol. 4811, pp. 187-206
Main Authors: Durant, Kathleen T.; Smith, Michael D.
Format: Book Chapter
Language: English
Published: Germany, Springer Berlin / Heidelberg, 2007
Springer Berlin Heidelberg
Series: Lecture Notes in Computer Science
ISBN: 354077484X
9783540774846
ISSN: 0302-9743
1611-3349
DOI: 10.1007/978-3-540-77485-3_11

Summary: As the number of web logs dramatically grows, readers are turning to them as an important source of information. Automatic techniques that identify the political sentiment of web log posts will help bloggers categorize and filter this exploding information source. In this paper we illustrate the effectiveness of supervised learning for sentiment classification on web log posts. We show that a Naïve Bayes classifier coupled with a forward feature selection technique can on average correctly predict a posting’s sentiment 89.77% of the time with a standard deviation of 3.01. It significantly outperforms Support Vector Machines at the 95% confidence level with a confidence interval of [1.5, 2.7]. The feature selection technique provides on average an 11.84% and a 12.18% increase for Naïve Bayes and Support Vector Machines results respectively. Previous sentiment classification research achieved an 81% accuracy using Naïve Bayes and 82.9% using SVMs on a movie domain corpus.