Early Detection of Numerical Typing Errors Using Data Mining Techniques

Bibliographic Details
Published in IEEE Transactions on Systems, Man, and Cybernetics. Part A, Systems and Humans, Vol. 41, no. 6, pp. 1199-1212
Main Authors Shouyi Wang, Cheng-Jhe Lin, Changxu Wu, W. A. Chaovalitwongse
Format Journal Article
Language English
Published IEEE 01.11.2011
ISSN 1083-4427, 1558-2426
DOI 10.1109/TSMCA.2011.2116006

Summary: This paper studies the application of data mining techniques to the early detection of numerical typing errors made by human operators, through a quantitative analysis of multichannel electroencephalogram (EEG) recordings. Three feature extraction techniques were developed to capture temporal, morphological, and time-frequency (wavelet) characteristics of EEG data. Two of the most commonly used data mining techniques, namely linear discriminant analysis (LDA) and support vector machine (SVM), were employed to classify EEG samples associated with correct and erroneous keystrokes. Leave-one-error-pattern-out and leave-one-subject-out cross-validation methods were designed to evaluate in-subject and cross-subject classification performance, respectively. For in-subject classification, the best testing performance, a sensitivity of 62.20% and a specificity of 51.68%, was achieved by SVM using morphological features. For cross-subject classification, the best testing performance, a sensitivity of 68.72% and a specificity of 49.45%, was achieved by LDA using temporal features. In addition, receiver operating characteristic (ROC) analysis revealed that the average area under the ROC curve of LDA and SVM was greater than 0.60 for both in-subject and cross-subject classification, using EEG recorded 300 ms prior to the keystrokes. These classification results indicate that the EEG patterns of erroneous keystrokes might differ from those of correct ones, so it may be possible to predict erroneous keystrokes before they occur. The classification problem addressed in this study is extremely challenging because each subject made very few erroneous keystrokes and because EEG data have complex spatiotemporal characteristics. Nevertheless, the outcome is encouraging and suggests that an early detection system for erroneous keystrokes based on brain-wave signals could be developed.
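The evaluation terms used in the summary can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `sensitivity_specificity` and `leave_one_subject_out` are hypothetical, the label 1 is assumed to mark an erroneous keystroke, and the classifier itself (LDA or SVM) is left out. The sketch only shows how per-subject folds could be formed and how sensitivity and specificity follow from predicted labels.

```python
from collections import defaultdict


def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity).

    Assumption: the positive class (1) marks an erroneous keystroke.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # Sensitivity: fraction of erroneous keystrokes that were detected.
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: fraction of correct keystrokes that were left unflagged.
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) per subject.

    Every sample of one subject forms the test set; samples from all
    other subjects form the training set.
    """
    by_subject = defaultdict(list)
    for i, subject in enumerate(subject_ids):
        by_subject[subject].append(i)
    for held_out in sorted(by_subject):
        test_idx = by_subject[held_out]
        train_idx = sorted(
            i for s, idx in by_subject.items() if s != held_out for i in idx
        )
        yield held_out, train_idx, test_idx
```

In a cross-subject evaluation of this kind, a classifier would be fitted on `train_idx` and scored on `test_idx` for each fold, and the per-fold sensitivity and specificity would then be averaged.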