A note on phase transitions and computational pitfalls of learning from sequences

Bibliographic Details
Published in: Journal of Intelligent Information Systems, Vol. 31, No. 2, pp. 177–189
Main Authors: Cornuéjols, Antoine; Sebag, Michèle
Format: Journal Article
Language: English
Published: Boston: Springer US (Springer Nature B.V.), 01.10.2008
ISSN: 0925-9902, 1573-7675
DOI: 10.1007/s10844-008-0063-6

Summary: An ever greater range of applications calls for learning from sequences. Grammar induction is one prominent tool for sequence learning; it is therefore important to know its properties and limits. This paper presents a new type of analysis for inductive learning. A few years ago, the discovery of a phase transition phenomenon in inductive logic programming proved that fundamental characteristics of the learning problem may affect the very possibility of learning under very general conditions. We show that, in the case of grammatical inference, while there is no phase transition when considering the whole hypothesis space, there is a much more severe “gap” phenomenon affecting the effective search space of standard grammatical induction algorithms for deterministic finite automata (DFA). Focusing on standard search heuristics, we show that they overcome this difficulty to some extent, but that they are subject to overgeneralization. The paper finally suggests some directions to alleviate this problem.
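For context on the "standard grammatical induction algorithms for DFA" mentioned in the abstract, the following is a minimal illustrative sketch (not the authors' code) of the prefix tree acceptor that state-merging algorithms such as RPNI typically start from. The function names build_pta and accepts are hypothetical; the sketch only shows the initial, ungeneralized automaton, whereas the merging heuristics studied in the paper generalize it by collapsing states, which is where the overgeneralization risk discussed in the abstract arises.

    # Minimal sketch: prefix tree acceptor (PTA) over a positive sample.
    # State-merging DFA induction (e.g., RPNI-style) would then merge
    # compatible states of this acceptor to generalize.

    def build_pta(positive_strings):
        """Build a prefix tree acceptor: one state per prefix of the sample."""
        transitions = {}      # (state, symbol) -> state
        accepting = set()
        next_state = 1        # state 0 is the root (empty prefix)
        for s in positive_strings:
            state = 0
            for symbol in s:
                if (state, symbol) not in transitions:
                    transitions[(state, symbol)] = next_state
                    next_state += 1
                state = transitions[(state, symbol)]
            accepting.add(state)
        return transitions, accepting

    def accepts(transitions, accepting, s):
        """Run the automaton on s; reject on any missing transition."""
        state = 0
        for symbol in s:
            if (state, symbol) not in transitions:
                return False
            state = transitions[(state, symbol)]
        return state in accepting

    if __name__ == "__main__":
        # Hypothetical positive sample over the alphabet {a, b}.
        pta = build_pta(["ab", "abab", "ababab"])
        print(accepts(*pta, "abab"))  # True: string appears in the sample
        print(accepts(*pta, "ba"))    # False: the PTA does not generalize by itself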