N-Folded Parallel String Matching Mechanism

Bibliographic Details
Published in Annals of data science Vol. 3; no. 4; pp. 339 - 384
Main Authors Katari, Butchi Raju, Viswanadha Raju, S.
Format Journal Article
Language English
Published Berlin/Heidelberg Springer Berlin Heidelberg 01.12.2016
ISSN 2198-5804
2198-5812
DOI 10.1007/s40745-016-0086-8

Summary: The massive demand for information has heightened the importance of managing enormous amounts of data. Fetching the desired data from large data stores is a herculean task, as it involves text processing, text mining, pattern recognition, data cleaning, and so on. Supporting concurrent events and devising high-performance processing models to extract data remains a challenge for researchers. One solution to this challenge is to match strings concurrently on such processing models. Some existing mechanisms perform very well in practice. Many works have been published on this subject, and research is still active in this area because the scope and opportunities for developing new techniques are perennial. This paper proposes an N-folded parallel string matching mechanism. The mechanism divides the input sequence files into several parts and distributes them to the processors. Using this mechanism as a model, experiments were conducted with chloroplast, mitochondria, and various categories of plant genome sequence files of different sizes as input, with seven possible patterns. The experimental results show that the N-folded parallel string matching mechanism can reduce processing time on a multiprocessor system.
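
The divide-and-distribute idea in the summary can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`split_with_overlap`, `find_in_chunk`, `n_fold_search`), the overlap of pattern length minus one at each boundary, and the use of a thread pool as a stand-in for separate processors are all assumptions made for illustration. The record does not say how the paper handles matches that straddle two parts.

```python
from concurrent.futures import ThreadPoolExecutor


def split_with_overlap(text, n_parts, pattern_len):
    """Split text into about n_parts chunks, returned as (offset, chunk) pairs.

    Assumption: each chunk is extended by pattern_len - 1 characters so that
    a match starting near the end of one part is still seen in full.
    """
    size = max(1, -(-len(text) // n_parts))  # ceiling division
    overlap = pattern_len - 1
    return [(start, text[start:start + size + overlap])
            for start in range(0, len(text), size)]


def find_in_chunk(task):
    """Return the global start positions of pattern inside one chunk."""
    offset, chunk, pattern = task
    hits = []
    pos = chunk.find(pattern)
    while pos != -1:
        hits.append(offset + pos)
        pos = chunk.find(pattern, pos + 1)  # step by 1 to keep overlapping matches
    return hits


def n_fold_search(text, pattern, n_parts=4):
    """Search each part concurrently and merge the per-part results."""
    if not text or not pattern:
        return []
    tasks = [(offset, chunk, pattern)
             for offset, chunk in split_with_overlap(text, n_parts, len(pattern))]
    # Threads stand in for processors here (an assumption of this sketch).
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        results = pool.map(find_in_chunk, tasks)
    return sorted(pos for hits in results for pos in hits)


if __name__ == "__main__":
    print(n_fold_search("ACGTACGTACGA", "ACG", n_parts=3))
```

Because a chunk reaches only pattern length minus one characters past its boundary, every match start falls in exactly one chunk, so the merged list needs no de-duplication. On CPython, threads do not give true CPU parallelism for this workload; a process pool from the same standard-library module would be the closer analogue to a multiprocessor system.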