Detecting Text Similarity Using MapReduce Framework

Bibliographic Details
Published in Europe and MENA Cooperation Advances in Information and Communication Technologies Vol. 520; pp. 383-389
Main Authors Birjali, Marouane; Beni-Hssane, Abderrahim; Erritali, Mohammed; Madani, Youness
Format Book Chapter
Language English
Published Switzerland Springer International Publishing AG 2016
Springer International Publishing
Series Advances in Intelligent Systems and Computing
ISBN 3319465678
9783319465678
ISSN 2194-5357
2194-5365
DOI 10.1007/978-3-319-46568-5_39

Summary: Evaluating the similarity between textual documents is an active research topic in many domains. Large corpora contain many documents, most of which must be checked for similarity as part of validation. In this paper, we propose a new MapReduce algorithm for measuring document similarity. We then survey state-of-the-art approaches for computing the similarity of large numbers of documents in order to choose the approach used in our MapReduce algorithm. Next, we present how the similarity between terms is used to assess the similarity between documents. Simulation results on the Hadoop framework show that our MapReduce algorithm outperforms classical ones in terms of running time.