Information Retrieval in Financial Documents

Bibliographic Details
Published in Evolving Technologies for Computing, Communication and Smart World, Vol. 694, pp. 265-274
Main Authors Sumithra, M. Kani; Sridhar, Rajeswari
Format Book Chapter
Language English
Published Singapore Springer 2020
Springer Singapore
Series Lecture Notes in Electrical Engineering
ISBN 9789811578038
9811578036
ISSN 1876-1100
1876-1119
DOI 10.1007/978-981-15-7804-5_20

Summary: Improvements in computing power and the falling cost of storage have led to exponential growth in day-to-day data, which can be processed to retrieve information and gain insights and knowledge. Information retrieval models such as the Boolean, vector, and probabilistic models help achieve this goal. However, these models suffer from documentary silence and documentary noise because they represent the semantic content of documents only approximately, poorly, and partially. In this paper, we build a system that constructs knowledge graphs from unstructured data and then queries the graph to retrieve information. A knowledge graph is a type of knowledge representation consisting of a collection of entities, events, real-world objects, or abstract concepts that are linked together by relations. Knowledge graphs are preferred because they contain large volumes of factual information with less formal semantics. Our approach is to preprocess the data and then perform named entity recognition with appropriate finance-related tags. An entity relation formulator extracts entities and matches them to relations, forming triples of subject, predicate, and object. This set of triples can be queried with a natural language query, which is converted into a knowledge graph query. The evaluation metrics employed in this paper are accuracy, precision, and recall. The system has an accuracy of 0.822, a precision of 0.837, and a recall of 0.9015 on a set of 500 questions and answers.
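As a rough illustration of the triple representation and the precision and recall metrics mentioned in the summary, the following is a minimal sketch. It is not the authors' system: the `Triple` type, the `match` and `precision_recall` helpers, and the sample facts are all hypothetical, and the natural language to graph query conversion is replaced here by a simple pattern match over an in-memory list.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One (subject, predicate, object) fact in a toy knowledge graph."""
    subject: str
    predicate: str
    obj: str


# Hypothetical finance-flavoured facts, invented purely for illustration.
TRIPLES = [
    Triple("Acme Corp", "reported_revenue", "10M USD"),
    Triple("Acme Corp", "has_ceo", "Jane Doe"),
    Triple("Globex", "acquired", "Acme Corp"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples whose fields equal every argument that is not None."""
    return [
        t for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


def precision_recall(retrieved, relevant):
    """Compute precision and recall of a retrieved set against a relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


if __name__ == "__main__":
    # "Who is the CEO of Acme Corp?" expressed as a triple pattern.
    answers = [t.obj for t in match(TRIPLES, subject="Acme Corp", predicate="has_ceo")]
    print(answers)
    print(precision_recall(answers, ["Jane Doe"]))
```

A real system would typically store the triples in a graph database and translate questions into that database's query language; the list scan above only shows the shape of the data and of the lookup.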