Grammar induction using bit masking oriented genetic algorithm and comparative analysis

Bibliographic Details
Published in Applied Soft Computing, Vol. 38, pp. 453-468
Main Authors Pandey, Hari Mohan; Chaudhary, Ankit; Mehrotra, Deepti
Format Journal Article
Language English
Published Elsevier B.V. 01.01.2016
ISSN 1568-4946
1872-9681
DOI 10.1016/j.asoc.2015.09.044

Summary:
•A background on the theory of grammar induction is presented.
•The effect of premature convergence is discussed in detail.
•A system for grammar inference is proposed that uses mask-fill reproduction operators and a Boolean-based procedure with the minimum description length principle.
•Comparative analysis, discussion and observations of the obtained results are given.
•Statistical tests (F-test and post hoc test) are conducted.

This paper presents a bit masking oriented genetic algorithm (BMOGA) for context-free grammar induction. It combines crossover and mutation mask-fill operators with a Boolean-based procedure in two phases to guide the search from the ith generation to the (i+1)th generation. Crossover and mutation mask-fill operations are performed to generate a proportionate amount of the population in each generation. A parser has been implemented that checks the validity of the grammar rules based on acceptance or rejection of the positive and negative training strings of the language. Experiments are conducted on a collection of context-free and regular languages. The minimum description length principle has been used to generate a corpus of positive and negative samples as appropriate for the experiment. The BMOGA produces successive generations of individuals, computes their fitness at each step and chooses the best individual when the threshold (termination) condition is reached. Because the presented approach was found effective in handling premature convergence, its results are compared with approaches used to alleviate premature convergence. The analysis showed that the BMOGA performs better than the other algorithms: the random offspring generation approach, dynamic allocation of reproduction operators, the elite mating pool approach and the simple genetic algorithm. The success ratio is used as a quality measure, and its value shows the effectiveness of the BMOGA. Statistical tests indicate the superiority of the BMOGA over the other implemented approaches.
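
The summary does not spell out how the mask-fill operators work, so the following is only a minimal sketch of generic mask-driven crossover and mutation on bit-string chromosomes, not the BMOGA operators themselves. The function names (`mask_crossover`, `mask_mutation`, `random_mask`) and the encoding of a grammar as a flat list of bits are assumptions made for illustration.

```python
import random


def mask_crossover(parent_a, parent_b, mask):
    """Mask-driven crossover (illustrative assumption, not the paper's operator).

    Where the mask bit is 1 the first child copies parent_a and the second
    copies parent_b; where it is 0 the roles are swapped.
    """
    child_1 = [a if m else b for a, b, m in zip(parent_a, parent_b, mask)]
    child_2 = [b if m else a for a, b, m in zip(parent_a, parent_b, mask)]
    return child_1, child_2


def mask_mutation(chromosome, mask):
    """Flip every bit of the chromosome whose mask bit is 1 (XOR with the mask)."""
    return [bit ^ m for bit, m in zip(chromosome, mask)]


def random_mask(length, rate, rng):
    """Build a random bit mask; each position is 1 with probability `rate`."""
    return [1 if rng.random() < rate else 0 for _ in range(length)]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs give the same masks
    a = [1, 1, 1, 1, 1, 1, 1, 1]
    b = [0, 0, 0, 0, 0, 0, 0, 0]
    c1, c2 = mask_crossover(a, b, random_mask(len(a), 0.5, rng))
    mutated = mask_mutation(c1, random_mask(len(c1), 0.1, rng))
    print(c1, c2, mutated)
```

In this sketch the mask alone decides which positions are exchanged or flipped, so varying how masks are generated changes how much of each parent survives into the next generation.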