A summarizer system based on a semantic analysis of web documents

Bibliographic Details
Published in: 2015 International Conference on Technologies for Sustainable Development (ICTSD), pp. 1-6
Main Authors: Florence, Angelin; Padmadas, Vijaya
Format: Conference Proceeding
Language: English
Published: IEEE, 01.02.2015
DOI: 10.1109/ICTSD.2015.7095851

Summary: The availability of the web and search engines has made searching easier. Information overload remains a major problem that requires algorithms and tools for faster access. Electronic documents are a major source of business and academic information. To use these online documents effectively, it is crucial to be able to extract their summaries, and a summarization system is one solution to this problem. This project proposes a summarizer system that can summarize multiple documents. The input text documents are analyzed by a parser, which generates a parse tree for each sentence. RDF triples in the form of subject, verb and object are extracted from each sentence by analyzing the typed dependencies generated by the parser. The semantic distance between each pair of sentences is computed and stored in a matrix. The measure adopted to compute semantic distance is the Wu and Palmer distance. A clustering algorithm is applied to the extracted subject, verb and object space, grouping the extracted RDF triples into clusters. The important sentences for the final summary are then extracted using a sentence selection algorithm.
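
The Wu and Palmer measure named in the summary is commonly defined as a similarity, 2 * depth(lcs) / (depth(a) + depth(b)), where lcs is the deepest common ancestor of two concepts in an is-a taxonomy. The sketch below illustrates that formula and a pairwise distance matrix. It is not the authors' implementation: the toy taxonomy, the function names, and the choice of distance = 1 - similarity are all assumptions made for illustration, since the summary does not state which lexical resource or conversion the paper uses.

```python
# Hypothetical toy is-a taxonomy (child -> parent); a real system would
# more likely use a lexical database such as WordNet.
PARENT = {
    "entity": None,
    "animal": "entity",
    "dog": "animal",
    "cat": "animal",
    "artifact": "entity",
    "car": "artifact",
}


def path_to_root(concept):
    """Return the chain [concept, parent, ..., root]."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path


def depth(concept):
    """Depth counted in nodes, so the root has depth 1."""
    return len(path_to_root(concept))


def wu_palmer_similarity(a, b):
    """2 * depth(lcs) / (depth(a) + depth(b))."""
    ancestors_a = set(path_to_root(a))
    # Walking up from b, the first node shared with a is the deepest
    # common ancestor (least common subsumer).
    lcs = next(c for c in path_to_root(b) if c in ancestors_a)
    return 2.0 * depth(lcs) / (depth(a) + depth(b))


def wu_palmer_distance(a, b):
    """Assumed conversion: distance = 1 - similarity."""
    return 1.0 - wu_palmer_similarity(a, b)


def distance_matrix(concepts):
    """Symmetric matrix of pairwise distances, zeros on the diagonal."""
    return [[wu_palmer_distance(a, b) for b in concepts] for a in concepts]


if __name__ == "__main__":
    for row in distance_matrix(["dog", "cat", "car"]):
        print([round(value, 3) for value in row])
```

In the pipeline the summary describes, a matrix of this kind would be computed over the terms of the extracted subject, verb and object triples and then passed to the clustering step. How term-level distances are combined into sentence-level distances is not specified in the summary.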