Syntax-Based Collocation Extraction

Collocation is a key linguistic phenomenon that affects every text production task and can be exploited in many text analysis tasks. This book offers a comprehensive and up-to-date review of the theoretical and practical work on this topic.

Bibliographic Details
Main Author Seretan, Violeta
Format eBook Book
Language English
Published Dordrecht Springer Nature 2011
Springer
Springer Netherlands
Edition 1
Series Text, Speech and Language Technology
ISBN 9400701349
9789400701342
9400701330
9789400701335
ISSN 1386-291X
DOI 10.1007/978-94-007-0134-2

Table of Contents:
  • Intro -- Preface -- Contents
  • 1 Introduction -- 1.1 Collocations and Their Relevance for NLP -- 1.2 The Need for Syntax-Based Collocation Extraction -- 1.3 Aims -- 1.4 Chapters Outline
  • 2 On Collocations -- 2.1 Introduction -- 2.2 A Survey of Definitions -- 2.2.1 Statistical Approaches -- 2.2.2 Linguistic Approaches -- 2.2.3 Collocation vs. Co-occurrence -- 2.3 Towards a Core Collocation Concept -- 2.4 Theoretical Perspectives on Collocations -- 2.4.1 Contextualism -- 2.4.2 Text Cohesion -- 2.4.3 Meaning-Text Theory -- 2.4.4 Semantics and Metaphoricity -- 2.4.5 Lexis-Grammar Interface -- 2.5 Linguistic Descriptions -- 2.5.1 Semantic Compositionality -- 2.5.2 Morpho-Syntactic Characterisation -- 2.6 What Collocation Means in This Book -- 2.7 Summary
  • 3 Survey of Extraction Methods -- 3.1 Introduction -- 3.2 Extraction Techniques -- 3.2.1 Collocation Features Modelled -- 3.2.2 General Extraction Architecture -- 3.2.3 Contingency Tables -- 3.2.4 Association Measures -- 3.2.5 Criteria for the Application of Association Measures -- 3.3 Linguistic Preprocessing -- 3.3.1 Lemmatization -- 3.3.2 POS Tagging -- 3.3.3 Shallow and Deep Parsing -- 3.3.4 Beyond Parsing -- 3.4 Survey of the State of the Art -- 3.4.1 English -- 3.4.2 German -- 3.4.3 French -- 3.4.4 Other Languages -- 3.5 Summary
  • 4 Syntax-Based Extraction -- 4.1 Introduction -- 4.2 The Fips Multilingual Parser -- 4.3 Extraction Method -- 4.3.1 Candidate Identification -- 4.3.2 Candidate Ranking -- 4.4 Evaluation -- 4.4.1 On Collocation Extraction Evaluation -- 4.4.2 Evaluation Method -- 4.4.3 Experiment 1: Monolingual Evaluation -- 4.4.4 Results of Experiment 1 -- 4.4.5 Experiment 2: Cross-Lingual Evaluation -- 4.4.6 Results of Experiment 2 -- 4.5 Qualitative Analysis -- 4.5.1 Error Analysis -- 4.5.2 Intersection and Rank Correlation -- 4.5.3 Instance-Level Analysis -- 4.6 Discussion -- 4.7 Summary
  • 5 Extensions -- 5.1 Identification of Complex Collocations -- 5.1.1 The Method -- 5.1.2 Experimental Results -- 5.1.3 Related Work -- 5.2 Data-Driven Induction of Syntactic Patterns -- 5.2.1 The Method -- 5.2.2 Experimental Results -- 5.2.3 Related Work -- 5.3 Corpus-Based Collocation Translation -- 5.3.1 The Method -- 5.3.2 Experimental Results -- 5.3.3 Related Work -- 5.4 Summary
  • 6 Conclusion -- 6.1 Main Contributions -- 6.2 Future Directions
  • A List of Collocation Dictionaries -- English -- French -- Italian -- Polish -- Portuguese -- Russian -- Spanish
  • B List of Collocation Definitions
  • C Association Measures -- Mathematical Notes -- C.1 χ² -- C.2 Log-Likelihood Ratio
  • D Monolingual Evaluation (Experiment 1) -- D.1 Test Data and Annotations -- D.2 Results
  • E Cross-Lingual Evaluation (Experiment 2) -- E.1 Test Data and Annotations -- E.2 Results
  • F Output Comparison
  • References -- Index