Genres on the Web: Computational Models and Empirical Studies
Main Authors | Mehler, Alexander; Sharoff, Serge; Santini, Marina |
---|---|
Format | eBook Book |
Language | English |
Published | Dordrecht: Springer Nature; Springer Science + Business Media; Springer; Springer Netherlands, 2010 |
Edition | 1st ed. |
Series | Text, Speech and Language Technology |
Subjects | |
ISBN | 9048191785 9789048191789 9789048191772 9048191777 |
ISSN | 1386-291X |
DOI | 10.1007/978-90-481-9178-9 |
Abstract | The description, automatic identification and further processing of web genres is a novel field of research in computational linguistics, NLP and related areas such as text-technology, digital humanities and web mining. One of the driving forces behind this research is the idea of genre-enabled search engines, which let users additionally specify the web genres that the documents to be retrieved should comply with (e.g., personal homepage, weblog, scientific article). This book offers a thorough foundation for this emerging field of research on web genres and document types in web-based social networking. It provides theoretical foundations of web genres and presents corpus-linguistic approaches to their analysis and computational models for their classification. This includes research in the areas of web genre identification, web genre modelling and related fields such as: genres and registers in web-based communication; social software-based document networks; web genre ontologies and classification schemes; text-technological models of web genres; web content, structure and usage mining; web genre classification; and web as corpus. The book addresses researchers who want to become acquainted with theoretical developments, computational models and their empirical evaluation in this field of research. It also addresses researchers who are interested in standards for the creation of corpora of web documents. Thus, the book concerns readers from many disciplines such as corpus linguistics, computational linguistics, text-technology and computer science. |
---|---|
AbstractList | The volume "Genres on the Web" has been designed for a wide audience, from the expert to the novice. It is a required book for scholars, researchers and students who want to become acquainted with the latest theoretical, empirical and computational advances in the expanding field of web genre research. The study of web genre is an overarching, interdisciplinary and novel area of research that ranges from corpus linguistics, computational linguistics, NLP and text-technology to web mining, webometrics, social network analysis and information studies. This book gives readers a thorough grounding in the latest research on web genres and emerging document types. The book covers a wide range of web-genre focused subjects, such as: the identification of the sources of web genres; automatic web genre identification; the presentation of structure-oriented models; and empirical case studies. One of the driving forces behind genre research is the idea of a genre-sensitive information system, which incorporates genre cues complementing the current keyword-based search and retrieval applications. This book presents the latest research on web genres and emerging document types. It covers a wide range of web-genre focused subjects, such as: the identification of the sources of web genres, automatic web genre identification and structure-oriented models. |
Author | Santini, Marina Sharoff, Serge Mehler, Alexander |
Author_xml | – sequence: 1 fullname: Mehler, Alexander – sequence: 2 fullname: Sharoff, Serge – sequence: 3 fullname: Santini, Marina |
BackLink | https://cir.nii.ac.jp/crid/1130000795103008384 (View record in CiNii) |
ContentType | eBook Book |
Copyright | Springer Science+Business Media B.V. 2011 |
Copyright_xml | – notice: Springer Science+Business Media B.V. 2011 |
DBID | I4C 08O RYH |
DEWEY | 005 |
DOI | 10.1007/978-90-481-9178-9 |
DatabaseName | Casalini Torrossa eBooks Institutional Catalogue ciando eBooks CiNii Complete |
DatabaseTitleList | |
DeliveryMethod | fulltext_linktorsrc |
Discipline | Languages & Literatures Computer Science |
EISBN | 9048191785 9789048191789 |
Edition | 1st ed. 1 |
Editor | Santini, Marina Sharoff, Serge Mehler, Alexander |
Editor_xml | – sequence: 1 fullname: Mehler, Alexander – sequence: 2 fullname: Santini, Marina – sequence: 3 fullname: Sharoff, Serge |
ExternalDocumentID | 9789048191789 192746 EBC645584 BB04035733 ciando203874 5447097 |
GroupedDBID | -T. .~Z 089 0DA 0DD 0E8 20A 38. A4J AABBV AAFYB AAINA AAMFE ABMNI ACBPT ACDPG AECAB AECMQ AEGQK AEKFX AETDV AEZAY AFNRJ ALMA_UNASSIGNED_HOLDINGS AZZ BBABE C9S C9V CZZ E6I I4C IEZ MYL NUC SAS SBO TPJZQ UZ6 Z83 Z84 Z88 08O T. Z AAJYQ AATVQ ABBUY ABCYT ACDTA ACDUY AEHEY AEJLV AHNNE ATJMZ RYH Z81 |
ID | FETCH-LOGICAL-a37538-bc57f5e2e8de0f92e7fd83b87a7c157b0fd4acdb9429aea1837257faa4eeb4dd3 |
ISBN | 9048191785 9789048191789 9789048191772 9048191777 |
ISSN | 1386-291X |
IngestDate | Fri Nov 08 00:09:58 EST 2024 Tue Jul 29 20:04:26 EDT 2025 Sat May 31 00:05:30 EDT 2025 Thu Jun 26 23:56:12 EDT 2025 Mon Aug 22 12:36:06 EDT 2022 Tue Nov 14 23:02:12 EST 2023 |
IsPeerReviewed | false |
IsScholarly | false |
Keywords | Computer Internet Intranet |
LCCN | 2010933721 |
LCCallNum_Ident | Q |
Language | English |
LinkModel | OpenURL |
MergedId | FETCHMERGED-LOGICAL-a37538-bc57f5e2e8de0f92e7fd83b87a7c157b0fd4acdb9429aea1837257faa4eeb4dd3 |
Notes | Includes bibliographical references and index |
OCLC | 687848719 |
PQID | EBC645584 |
PageCount | 363 |
ParticipantIDs | askewsholts_vlebooks_9789048191789 springer_books_10_1007_978_90_481_9178_9 proquest_ebookcentral_EBC645584 nii_cinii_1130000795103008384 ciando_primary_ciando203874 casalini_monographs_5447097 |
PublicationCentury | 2000 |
PublicationDate | 2010 2011 c2010 2014-07-30 |
PublicationDateYYYYMMDD | 2010-01-01 2011-01-01 2014-07-30 |
PublicationDate_xml | – year: 2010 text: 2010 |
PublicationDecade | 2010 |
PublicationPlace | Dordrecht |
PublicationPlace_xml | – name: Netherlands – name: Dordrecht |
PublicationSeriesTitle | Text, Speech and Language Technology |
PublicationSeriesTitleAlternate | Text,Speech,Language Tech. |
PublicationYear | 2010 2011 2014 |
Publisher | Springer Nature Springer Science + Business Media Springer Springer Netherlands |
Publisher_xml | – name: Springer Nature – name: Springer Science + Business Media – name: Springer – name: Springer Netherlands |
SSID | ssj0000449530 |
Score | 2.0946996 |
Snippet | The description, automatic identification and further processing of web genres is a novel field of research in computational linguistics, NLP and related areas... This book presents the latest research on web genres and emerging document types. It covers a wide range of web-genre focused subjects, such as: the... The volume "Genres on the Web" has been designed for a wide audience, from the expert to the novice. It is a required book for scholars, researchers and... |
SourceID | askewsholts springer proquest nii ciando casalini |
SourceType | Aggregation Database Publisher |
SubjectTerms | Classification Computational Linguistics Computer programming, programs, data Computer Science Corpora (Linguistics) Natural Language Processing (NLP) Nonbook materials World Wide Web |
Subtitle | Computational Models and Empirical Studies |
TableOfContents | Intro -- Foreword -- Personal Note -- Contents -- Contributors -- Part I Introduction -- 1 Riding the Rough Waves of Genre on the Web -- Marina Santini, Alexander Mehler, and Serge Sharoff -- 1.1 Why Is Genre Important? -- 1.1.1 Zooming In: Information on the Web -- 1.2 Trying to Grasp the Ungraspable? -- 1.2.1 In Quest of a Definition of Web Genre for Empirical Studies and Computational Applications -- 1.3 Empirical and Computational Approaches to Genre: Open Issues -- 1.3.1 Web Documents -- 1.3.2 Corpora, Genres and the Web -- 1.3.3 Empirical and Computational Models of Web Genres -- 1.4 Conclusions -- 1.5 Outline of the Volume -- References -- Part II Identifying the Sources of Web Genres -- 2 Conventions and Mutual Expectations -- Jussi Karlgren -- 2.1 Genres Are Not Rule-Bound -- 2.2 So, Let's Ask the Readers -- 2.3 An Editorial, Third Party, View of Genres on the Web -- 2.4 Data Source: Observation of User Actions -- 2.5 Conclusions -- References -- 3 Identification of Web Genres by User Warrant -- Mark A. Rosso and Stephanie W. Haas -- 3.1 Introduction -- 3.2 Criteria for the Identification of Web Genre -- 3.3 Operationalizing Traditional Genre Theory for the World Wide Web -- 3.3.1 A Genre's User Group -- 3.3.2 Genre: Function, Form and Substance -- 3.3.3 Genres on the Web: Further Implications for Research -- 3.4 Developing a Web Genre Palette -- 3.4.1 Collecting Genre Terminology in the Users' Own Words -- 3.4.2 Users Choose the Best of the Collected Genre Terminology -- 3.4.3 User Validation of the Genre Palette -- 3.4.4 A Fourth Study: Determining the Genres' Usefulness for Web Search -- 3.5 Conclusion -- References -- 4 Problems in the Use-Centered Development of a Taxonomy of Web Genres -- Kevin Crowston, Barbara Kwasnik, and Joseph Rubleske -- 4.1 Introduction -- 4.1.1 What Is the Purpose of a Genre Taxonomy? -- 4.2 Why Is It Hard to Develop a Web Genre Taxonomy? -- 4.2.1 Difficulties in Defining Genres -- 4.2.2 Difficulties in Developing the Scope and Expressiveness of the Taxonomy -- 4.3 A Use-Centered Development of a Taxonomy of Web Genres -- 4.3.1 Research Design: Naturalistic Field Study -- 4.3.2 Research Informants -- 4.3.3 Data Elicitation -- 4.3.4 Data Analysis -- 4.4 Results -- 4.5 Discussion -- 4.6 Conclusions -- References -- Part III Automatic Web Genre Identification -- 5 Cross-Testing a Genre Classification Model for the Web -- Marina Santini -- 5.1 Introduction -- 5.2 Approximating Genre Population on the Web -- 5.2.1 Noise -- 5.2.2 Description of the Corpora Used for Cross-Testing -- 5.3 The Web as Communication -- 5.3.1 Genre Palette -- 5.3.2 Linguistically- and Functionally-Motivated Features -- 5.4 The Genre Model -- 5.4.1 Methodology -- 5.4.2 Flow and Hypotheses -- 5.5 Results -- 5.5.1 Cross-Testing Performance on Single Labels: BBC and 7-Webgenre Collections -- 5.5.2 Performances of Other Single-Label Models on the 7-Webgenre Collection -- 5.5.3 Cross-Testing Performance on Single Labels: Mapped Web Genres -- 5.5.4 Cross-Testing Performance on Single Labels: HCG and MCG in Isolation -- 5.5.5 The SPIRIT Sample: An Attempt to Assess Multilabelling -- 5.6 Discussion -- 5.7 Conclusion and Future Work -- References -- 6 Formulating Representative Features with Respect to Genre Classification -- Yunhyong Kim and Seamus Ross -- 6.1 Introduction -- 6.2 Defining Genre Classification -- 6.2.1 Document Representation in Conventional Text Classification -- 6.2.2 Harmonic Descriptor Representation (HDR) of Documents -- 6.2.3 Defining Genre -- 6.3 Classifiers -- 6.4 Dataset -- 6.5 Features -- 6.6 Results -- 6.6.1 Overall Accuracy -- 6.6.2 Precision and Recall -- 6.7 Conclusions -- References -- 7 In the Garden and in the Jungle -- Serge Sharoff -- 7.1 Introduction -- 7.2 Text Typology for the Web -- 7.3 An Experiment in Automatic Classification of the Web -- 7.4 Analysis of Results -- 7.4.1 Qualitative Assessment of Texts in Each Category -- 7.4.2 Assessing the Composition of ukWac -- 7.5 Conclusions and Future Research -- References -- 8 Web Genre Analysis: Use Cases, Retrieval Models, and Implementation Issues -- Benno Stein, Sven Meyer zu Eissen, and Nedim Lipka -- 8.1 Introduction -- 8.1.1 Contributions -- 8.2 Use Cases: Genre Analysis in the Retrieval Practice -- 8.2.1 Genre-Enabled Web Search -- 8.2.2 Information Extraction Based on Genre Information -- 8.2.3 Organizing Collections in Both Topic and Genre Dimensions -- 8.2.4 Empower Web Page Abstraction with Genre Information -- 8.3 Construction of Genre Retrieval Models -- 8.3.1 Problems of Genre Retrieval Models and Lessons Learned -- 8.3.2 New Elements for Genre Retrieval Models -- 8.4 Evaluation -- 8.4.1 Improving Generalization Capability -- 8.4.2 Measuring Generalization Capability -- 8.4.3 Experiments -- 8.5 Implementing Genre-Enabled Web Search -- 8.6 Conclusion -- References -- 9 Marrying Relevance and Genre Rankings: An Exploratory Study -- Pavel Braslavski -- 9.1 Introduction -- 9.2 Related Work -- 9.2.1 Genre Classification -- 9.2.2 Readability Scores -- 9.2.3 Genres in Relevance Ranking -- 9.3 Data -- 9.3.1 Functional Styles Sample -- 9.3.2 ROMIP Collection -- 9.4 Formality Score -- 9.5 Results -- 9.5.1 Genre-Related Rankings -- 9.5.2 Merged Rankings -- 9.6 Conclusion -- References -- Part IV Structure-Oriented Models of Web Genres -- 10 Classification of Web Sites at Super-Genre Level -- Christoph Lindemann and Lars Littig -- 10.1 Introduction -- 10.2 Related Work -- 10.3 Dataset -- 10.4 Features for Classification -- 10.4.1 Features Derived from Structure -- 10.4.2 Features Derived from Content -- 10.5 Classification of Web Sites -- 10.5.1 Classification by Structure -- 10.5.2 Classification by Content -- 10.5.3 Classification by Structure and Content -- 10.6 Conclusion -- References -- 11 Mining Graph Patterns in Web-Based Systems: A Conceptual View -- Matthias Dehmer and Frank Emmert-Streib -- 11.1 Introduction -- 11.2 Mathematical Preliminaries -- 11.3 Structural Graph Measures -- 11.4 Graph Similarity Measures for Web Mining -- 11.4.1 Classical Similarity and Distance Measures for Graphs -- 11.4.2 Graph Similarity Measures Based on Trees -- 11.4.3 Structural Similarity of Generalized Trees -- 11.5 Applications -- 11.6 Conclusion -- References -- 12 Genre Connectivity and Genre Drift in a Web of Genres -- Lennart Björneborn -- 12.1 Introduction -- 12.2 Methodology -- 12.2.1 Source Pages and Target Pages -- 12.2.2 Genre Categorization -- 12.3 Results and Discussion -- 12.3.1 Source Genres, Target Genres and Genre Pairs -- 12.3.2 Web of Genres -- 12.3.3 "Hook" Genres and "Lug" Genres -- 12.3.4 Genre Drift, Topic Drift and Small-World Implications -- 12.4 Conclusion -- References -- Part V Case Studies of Web Genres -- 13 Genre Emergence in Amateur Flash -- John C. Paolillo, Jonathan Warren, and Breanne Kunz -- 13.1 Genres, Multimedia and the Web -- 13.2 Flash and Newgrounds in Amateur Multimedia -- 13.3 Method -- 13.3.1 Sampling -- 13.3.2 Identifying Potential Emergent Genres -- 13.3.3 Cultural References and Message Content -- 13.4 Results -- 13.4.1 Network Analysis -- 13.4.2 Genre Features -- 13.4.3 Cultural References -- 13.4.4 Genre, Emergence and Social Network -- 13.5 Discussion and Conclusions -- References -- 14 Variation Among Blogs: A Multi-Dimensional Analysis -- Jack Grieve, Douglas Biber, Eric Friginal, and Tatiana Nekrasova -- 14.1 Introduction -- 14.2 Corpus Compilation and Analysis -- 14.3 Factor Analysis -- 14.3.1 Method -- 14.3.2 Results -- 14.3.3 Interpretation of Factors -- 14.4 Text Type Analysis -- 14.4.1 Method -- 14.4.2 Results -- 14.4.3 Interpretation of Clusters -- 14.5 Summary of Findings -- References -- 15 Evolving Genres in Online Domains: The Hybrid Genre of the Participatory News Article -- Ian Bruce -- 15.1 Introduction -- 15.1.1 The Systemic Functional Approach to Genre -- 15.1.2 The English for Specific Purposes Approach to Genre -- 15.1.3 Problems with these Existing Approaches to Genre -- 15.1.4 A Solution: Social Genre and Cognitive Genre -- 15.1.5 A Web Genre: The Participatory News Article -- 15.2 Methodology -- 15.3 Results -- 15.3.1 The News Article -- 15.3.2 Reader Comments -- 15.4 Discussion -- 15.5 Conclusion -- References -- Part VI Prospect -- 16 Any Land in Sight? -- Marina Santini, Serge Sharoff, and Alexander Mehler -- 16.1 Web Genre Benchmarks -- 16.1.1 Genre Labels -- 16.1.2 Annotation -- 16.1.3 Representativeness -- 16.2 Work Plan -- 16.2.1 Benefits -- Index |
Title | Genres on the Web |
URI | http://digital.casalini.it/9789048191789 http://ebooks.ciando.com/book/index.cfm/bok_id/203874 https://cir.nii.ac.jp/crid/1130000795103008384 https://ebookcentral.proquest.com/lib/[SITE_ID]/detail.action?docID=645584 http://link.springer.com/10.1007/978-90-481-9178-9 https://www.vlebooks.com/vleweb/product/openreader?id=none&isbn=9789048191789&uid=none |
Volume | 42 |
hasFullText | 1 |
inHoldings | 1 |
isFullTextHit | |
isPrint | |
linkProvider | Library Specific Holdings |
openUrl | ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=book&rft.title=Genres+on+the+Web+%3A+computational+models+and+empirical+studies&rft.au=Mehler%2C+Alexander&rft.au=Sharoff%2C+Serge&rft.au=Santini%2C+Marina&rft.date=2010-01-01&rft.pub=Springer&rft.isbn=9789048191772&rft_id=info:doi/10.1007%2F978-90-481-9178-9&rft.externalDocID=BB04035733 |