An Approach for Analyzing Unstructured Text Data Using Topic Modeling Techniques for Efficient Information Extraction


Bibliographic Details
Published in: New Generation Computing, Vol. 42, no. 1, pp. 109-134
Main Authors: Zadgaonkar, Ashwini; Agrawal, Avinash J.
Format: Journal Article
Language: English
Published: Tokyo, Springer Japan, 01.03.2024
Springer Nature B.V.
ISSN: 0288-3635
1882-7055
DOI: 10.1007/s00354-023-00230-5

Summary: Topic modeling techniques are widely used for document clustering, large-scale text analysis, information extraction from unstructured text documents, feature selection from large corpora, and recommendation systems. This work proposes a framework that uses topic modeling techniques to extract legal information from the unstructured legal judgments of the Indian judicial system. The approach aims to replace time-consuming manual judgment analysis with automated analysis that can examine a large number of judgments in less time. We experimented with different topic modeling methodologies for information extraction. The proposed framework is built on Latent Dirichlet Allocation and categorizes legal judgments into the extracted topic groups. Indian Supreme Court judgments form the experimental corpus. The three main elements of the framework are pre-processing, applying the topic model, and model evaluation using a coherence score metric. The framework was successfully applied to corpora of 100, 500, and 1000 legal judgments in batches. The proposed framework is also used to measure legal judgment similarity as a quantitative evaluation. As future scope, various legal tasks that could benefit from the proposed framework are suggested.
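The three elements named in the summary (pre-processing, applying the topic model, and coherence-based evaluation), followed by a judgment similarity measure, can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation. The stopword list, the collapsed Gibbs sampler, the hyperparameters `alpha` and `beta`, the choice of UMass coherence, the use of cosine similarity, and the four toy sentences are all assumptions made for illustration; the record does not specify any of them.

```python
import math
import random
import re

# Hypothetical, tiny stopword list; a real pipeline would use a fuller one.
STOPWORDS = {"the", "of", "and", "was", "for", "by", "set"}

def preprocess(text):
    """Lowercase, keep alphabetic tokens, drop stopwords and tokens under 3 letters."""
    return [t for t in re.findall(r"[a-z]+", text.lower())
            if t not in STOPWORDS and len(t) > 2]

def fit_lda(docs, num_topics, iters=200, alpha=0.1, beta=0.01, seed=0):
    """LDA via collapsed Gibbs sampling (assumed inference method).

    Returns (doc_topic distributions, topic-word counts, vocabulary).
    """
    rng = random.Random(seed)
    vocab = sorted({w for d in docs for w in d})
    wid = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)
    ndk = [[0] * num_topics for _ in docs]       # document-topic counts
    nkw = [[0] * V for _ in range(num_topics)]   # topic-word counts
    nk = [0] * num_topics                        # tokens assigned to each topic
    z = []                                       # topic assignment of every token
    for d, doc in enumerate(docs):
        z.append([])
        for w in doc:
            k = rng.randrange(num_topics)
            z[d].append(k)
            ndk[d][k] += 1
            nkw[k][wid[w]] += 1
            nk[k] += 1
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k, v = z[d][i], wid[w]
                # Remove the token's current assignment, then resample it.
                ndk[d][k] -= 1
                nkw[k][v] -= 1
                nk[k] -= 1
                weights = [(ndk[d][t] + alpha) * (nkw[t][v] + beta) / (nk[t] + V * beta)
                           for t in range(num_topics)]
                k = rng.choices(range(num_topics), weights=weights)[0]
                z[d][i] = k
                ndk[d][k] += 1
                nkw[k][v] += 1
                nk[k] += 1
    doc_topic = [[(c + alpha) / (len(doc) + num_topics * alpha) for c in row]
                 for row, doc in zip(ndk, docs)]
    return doc_topic, nkw, vocab

def umass_coherence(top_words, docs):
    """UMass coherence: sum of log((D(w_i, w_j) + 1) / D(w_j)) over ranked word pairs."""
    doc_sets = [set(d) for d in docs]
    score = 0.0
    for i in range(1, len(top_words)):
        for j in range(i):
            d_j = sum(top_words[j] in s for s in doc_sets)
            d_ij = sum(top_words[i] in s and top_words[j] in s for s in doc_sets)
            score += math.log((d_ij + 1) / d_j)
    return score

def cosine(p, q):
    """Cosine similarity between two topic distributions."""
    dot = sum(a * b for a, b in zip(p, q))
    return dot / (math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q)))

# Made-up toy sentences, NOT taken from the Supreme Court corpus.
judgments = [
    "The appellant challenged the land acquisition and the compensation awarded.",
    "Compensation for acquisition of land was enhanced by the court.",
    "The accused was convicted of murder and the sentence was upheld.",
    "Conviction for murder was set aside and the accused acquitted.",
]
docs = [preprocess(j) for j in judgments]
doc_topic, topic_word, vocab = fit_lda(docs, num_topics=2)
top_words = [[vocab[v] for v in sorted(range(len(vocab)), key=lambda v: -row[v])[:5]]
             for row in topic_word]
coherence = [umass_coherence(words, docs) for words in top_words]
similarity = cosine(doc_topic[0], doc_topic[1])
```

Here `doc_topic[d]` is the topic mixture of judgment `d`, `coherence` holds one score per topic, and `similarity` compares the first two toy judgments by their topic mixtures. On a real corpus, an established library implementation of LDA and coherence would normally be preferred over a hand-written sampler.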