FamilyID: A Hybrid Approach to Identify Family Information from Microblogs

With the growing popularity of social networks, extremely large amount of users routinely post messages about their daily life to online social networking services. In particular, we have observed that family related information, including some very sensitive information, are freely available and ea...

Full description

Saved in:
Bibliographic Details
Published inData and Applications Security and Privacy XXIX Vol. 9149; pp. 215 - 222
Main Authors Gopal, Jamuna, Huang, Shu, Luo, Bo
Format Book Chapter
LanguageEnglish
Published Switzerland Springer International Publishing AG 2015
Springer International Publishing
SeriesLecture Notes in Computer Science
Subjects
Online AccessGet full text
ISBN3319208098
9783319208091
ISSN0302-9743
1611-3349
1611-3349
DOI10.1007/978-3-319-20810-7_14

Cover

More Information
Summary:With the growing popularity of social networks, extremely large amount of users routinely post messages about their daily life to online social networking services. In particular, we have observed that family related information, including some very sensitive information, are freely available and easily extracted from Twitter. In this paper, we present a hybrid information retrieval mechanism, namely FamilyID, to identify and extract family related information of a user from his/her microblogs (tweets). The proposed model takes into account part-of-speech tagging, pattern matching, lexical similarity, and semantic similarity of the tweets. Experiment results show that FamilyID provides both high precision and recall. We expect the project to serve as a warning to users that they may have accidentally revealed too much personal/family information to the public. It could also help microblog users to evaluate the amount of information that they have already revealed.
Bibliography:S. Huang and B. Luo—This work was partially supported by NSF CNS-1422206, NSF IIS-1513324, NSF OIA-1308762, and University of Kansas GRF-2301876.
ISBN:3319208098
9783319208091
ISSN:0302-9743
1611-3349
1611-3349
DOI:10.1007/978-3-319-20810-7_14