A General Algorithm of Association Rule-Based Machine Learning Dedicated for Text Classification

Bibliographic Details
Published in Journal of physics. Conference series Vol. 1773; no. 1; p. 012011
Main Authors Hamid, Zeyad; Khafaji, Hussein K
Format Journal Article
Language English
Published Bristol IOP Publishing 01.02.2021
ISSN 1742-6588
1742-6596
DOI 10.1088/1742-6596/1773/1/012011

Summary: Many data mining techniques and machine learning algorithms have been developed to classify textual data, including decision trees, support vector machines, and k-nearest neighbours, in addition to other machine learning-based algorithms. Association rule-based machine learning is accomplished in two phases: a training phase and a testing phase, which may be reinforced with new minimum support and confidence thresholds to enhance classification accuracy. Association rule mining, in its various applications, passes through two computationally intensive steps: frequent itemset mining and association rule extraction. This paper presents a general algorithm for association rule-based machine learning dedicated to text classification. To verify the efficiency of the algorithm, different text datasets were used: a tweets dataset for sentiment classification, PDF documents, and HTML documents. Sentiment classification experiments showed that the classifier constructed with minsup threshold = %700 and minconf threshold = 50% gives the best performance, with F1 = 0.9861811, while the experiments on HTML and PDF documents showed a classification accuracy of 94%.
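
The two steps named in the summary (frequent itemset mining, then rule extraction filtered by minimum support and confidence) can be illustrated with a minimal Python sketch. This is not the authors' algorithm: it enumerates word sets by brute force rather than using an optimized miner, treats `minsup` and `minconf` as fractions, and the names `tokenize`, `mine_class_rules`, `classify`, and `max_len` are hypothetical. The toy documents and thresholds are made up for illustration and are not the paper's data or settings.

```python
from collections import Counter
from itertools import combinations


def tokenize(text):
    # Assumption: a document is treated as a set of lowercase, whitespace-split words.
    return frozenset(text.lower().split())


def mine_class_rules(docs, labels, minsup, minconf, max_len=2):
    """Training phase: return rules (itemset, label, support, confidence)."""
    n = len(docs)
    itemset_count = Counter()  # documents containing each word set
    rule_count = Counter()     # documents containing the word set AND carrying the label
    for text, label in zip(docs, labels):
        words = sorted(tokenize(text))
        for k in range(1, max_len + 1):
            for combo in combinations(words, k):
                itemset_count[combo] += 1
                rule_count[(combo, label)] += 1

    rules = []
    for (combo, label), hits in rule_count.items():
        support = itemset_count[combo] / n        # step 1: is the itemset frequent?
        confidence = hits / itemset_count[combo]  # step 2: rule "itemset -> label"
        if support >= minsup and confidence >= minconf:
            rules.append((frozenset(combo), label, support, confidence))
    # Strongest rules first: by confidence, then support, then itemset length.
    rules.sort(key=lambda r: (-r[3], -r[2], -len(r[0])))
    return rules


def classify(text, rules, default):
    """Testing phase: the first (strongest) rule whose itemset occurs in the text wins."""
    words = tokenize(text)
    for itemset, label, _support, _confidence in rules:
        if itemset <= words:
            return label
    return default


# Toy training data (hypothetical).
docs = [
    "great movie loved it",
    "loved the great acting",
    "terrible movie hated it",
    "hated the terrible plot",
]
labels = ["pos", "pos", "neg", "neg"]

rules = mine_class_rules(docs, labels, minsup=0.5, minconf=0.9)
print(classify("a great film", rules, default="neg"))
```

Re-running `mine_class_rules` with different `minsup` and `minconf` values and re-scoring on held-out text corresponds to the reinforcement of the training and testing phases that the summary mentions.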