Finding Core Topics: Topic Extraction with Clustering on Tweet

Bibliographic Details
Published in: 2012 International Conference on Cloud and Green Computing, pp. 777-782
Main Authors: Sungchul Kim, Sungho Jeon, Jinha Kim, Young-Ho Park, Hwanjo Yu
Format: Conference Proceeding
Language: English
Published: IEEE, 01.11.2012
ISBN: 1467330272, 9781467330275
DOI: 10.1109/CGC.2012.120

Summary: Twitter is one of the most popular microblogging services; it lets users post short texts called tweets. Tweets differ from conventional text data in that they are typically short, informal messages, which makes typical text analysis methods work poorly. Accordingly, extracting meaningful topics from tweets brings up new challenges. In this work, we propose a simple and novel method called Core-Topic-based Clustering (CTC), which extracts topics and clusters tweets simultaneously based on two clustering principles: minimizing the inter-cluster similarity and maximizing the intra-cluster similarity. Experimental results show that our method efficiently extracts meaningful topics and that its clustering performance is better than that of the K-means algorithm.
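
The two clustering principles named in the summary can be illustrated with a small sketch. This is not the CTC algorithm from the paper; it only measures mean intra-cluster and inter-cluster similarity for a given assignment of tweets to clusters. The bag-of-words representation, the cosine measure, the function names, and the sample tweets are all assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_similarities(texts, labels):
    """Return (mean intra-cluster, mean inter-cluster) pairwise similarity.

    A good clustering should make the first value high and the second low.
    """
    # Assumption: whitespace tokenization on lowercased text.
    vecs = [Counter(t.lower().split()) for t in texts]
    intra, inter = [], []
    for i, j in combinations(range(len(texts)), 2):
        sim = cosine(vecs[i], vecs[j])
        (intra if labels[i] == labels[j] else inter).append(sim)

    def mean(xs):
        return sum(xs) / len(xs) if xs else 0.0

    return mean(intra), mean(inter)


# Hypothetical sample tweets, invented for this illustration.
tweets = [
    "new phone battery life",
    "phone battery drains fast",
    "rain again this weekend",
    "weekend rain ruins picnic",
]

good = cluster_similarities(tweets, [0, 0, 1, 1])  # groups tweets by topic
bad = cluster_similarities(tweets, [0, 1, 0, 1])   # mixes the two topics
print("good clustering (intra, inter):", good)
print("bad clustering  (intra, inter):", bad)
```

The assignment that groups tweets by topic scores higher on intra-cluster similarity and lower on inter-cluster similarity than the mixed assignment, which is the direction both principles push toward.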