Automated classification of web pages in hierarchical browsing

Bibliographic Details
Published in: Journal of Documentation, Vol. 65, no. 6, pp. 901-925
Main Authors: Golub, Koraljka; Lykke, Marianne
Format: Journal Article
Language: English
Published: Bingley: Emerald Group Publishing Limited, 16.10.2009
Emerald
ISSN: 0022-0418, 1758-7379
DOI: 10.1108/00220410910998915

Summary: Purpose - The purpose of this study is twofold: to investigate whether it is meaningful to use the Engineering Index (Ei) classification scheme for browsing, and then, if proven useful, to investigate the performance of an automated classification algorithm based on the Ei classification scheme.
Design/methodology/approach - A user study was conducted in which users solved four controlled searching tasks. The users browsed the Ei classification scheme in order to examine the suitability of the classification systems for browsing. The classification algorithm was evaluated by the users who judged the correctness of the automatically assigned classes.
Findings - The study showed that the Ei classification scheme is suited for browsing. Automatically assigned classes were on average partly correct, with some classes working better than others. Success of browsing showed to be correlated and dependent on classification correctness.
Research limitations/implications - Further research should address problems of disparate evaluations of one and the same web page. Additional reasons behind browsing failures in the Ei classification scheme also need further investigation.
Practical implications - Improvements for browsing were identified: describing class captions and/or listing their subclasses from start; allowing for searching for words from class captions with synonym search (easily provided for Ei since the classes are mapped to thesauri terms); when searching for class captions, returning the hierarchical tree expanded around the class in which caption the search term is found. The need for improvements of classification schemes was also indicated.
Originality/value - A user-based evaluation of automated subject classification in the context of browsing has not been conducted before; hence the study also presents new findings concerning methodology.