MCut: A Thresholding Strategy for Multi-label Classification
| Published in | Advances in Intelligent Data Analysis XI, pp. 172-183 |
|---|---|
| Main Authors | , , | 
| Format | Book Chapter | 
| Language | English | 
| Published | Berlin, Heidelberg: Springer Berlin Heidelberg, 2012 |
| Series | Lecture Notes in Computer Science | 
| Subjects | |
| ISBN | 9783642341557 3642341551  | 
| ISSN | 0302-9743 1611-3349  | 
| DOI | 10.1007/978-3-642-34156-4_17 | 
| Summary | Multi-label classification is a frequent task in machine learning, notably in text categorization. When binary classifiers are not suitable, an alternative is to use a multiclass classifier that provides a score per category for each document, and then to apply a thresholding strategy to select the set of categories that must be assigned to the document. Common thresholding strategies, such as the RCut, PCut and SCut methods, need a training step to determine the value of the threshold. To overcome this limitation, we propose a new strategy, called MCut, which automatically estimates a value for the threshold. This method does not have to be trained and does not need any parametrization. Experiments performed on two textual corpora, the XML Mining 2009 and RCV1 collections, show that the results of the MCut strategy are on par with the state of the art, while MCut is easy to implement and parameter-free. |
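
The summary contrasts thresholding strategies that need a tuned parameter with one that needs none, but this record does not define any of them. The sketch below is an illustration only. It assumes that a rank-based cut such as RCut keeps the `k` best-scoring categories, and that a parameter-free cut can be obtained by placing the threshold in the middle of the largest gap between consecutive sorted scores. Whether this matches the chapter's exact definition of MCut is an assumption; the function names `rank_cut` and `max_gap_cut` and the example scores are hypothetical.

```python
def rank_cut(scores, k):
    """Keep the k best-scoring categories (k must be chosen or tuned beforehand)."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:k])


def max_gap_cut(scores):
    """Parameter-free cut (assumed rule, not taken from the record):
    threshold at the midpoint of the largest gap between consecutive
    scores sorted in descending order."""
    ranked = sorted(scores.values(), reverse=True)
    if len(ranked) < 2:
        return set(scores)
    gaps = [ranked[i] - ranked[i + 1] for i in range(len(ranked) - 1)]
    i = max(range(len(gaps)), key=gaps.__getitem__)
    if gaps[i] == 0:
        # All scores are equal: there is no gap to cut at, so keep everything.
        return set(scores)
    threshold = (ranked[i] + ranked[i + 1]) / 2
    return {category for category, s in scores.items() if s > threshold}


# Hypothetical per-category scores for one document.
doc_scores = {"sports": 0.9, "politics": 0.8, "economy": 0.3, "science": 0.1}
top_one = rank_cut(doc_scores, k=1)
auto_cut = max_gap_cut(doc_scores)
```

With these scores, the largest gap lies between 0.8 and 0.3, so the second function sets its threshold at 0.55 and keeps two categories without any tuned value, whereas the rank-based cut returns exactly as many categories as `k` dictates.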