Hands-On Ensemble Learning with Python: Build Highly Optimized Ensemble Machine Learning Models Using Scikit-Learn and Keras

Ensemble learning provides methods to improve the accuracy and performance of existing models. In this book, you'll learn how to combine different machine learning algorithms to produce more accurate results from your models.
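To give a flavor of what the book teaches, here is a minimal sketch (an illustration, not code from the book) of that idea: combining three different scikit-learn classifiers with a hard-voting ensemble, using the breast cancer dataset covered in Chapter 1. `VotingClassifier` and `load_breast_cancer` are real scikit-learn APIs; the particular choice of base learners is an arbitrary example.

```python
# Minimal sketch: combining three learners by majority (hard) vote.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Three dissimilar base learners; the ensemble predicts the majority class.
ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=5000)),
        ("knn", KNeighborsClassifier()),
        ("dt", DecisionTreeClassifier(random_state=0)),
    ],
    voting="hard",
)
ensemble.fit(X_train, y_train)
print("Ensemble accuracy:", ensemble.score(X_test, y_test))
```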

Bibliographic Details
Main Authors: Kyriakides, George; Margaritis, Konstantinos G.
Format: eBook
Language: English
Published: Birmingham: Packt Publishing Limited, 2019
Edition: 1
ISBN: 1789612853; 9781789612851
DOI: 10.0000/9781789617887

Table of Contents:
  • Cover -- Title Page -- Copyright and Credits -- About Packt -- Contributors -- Table of Contents -- Preface
  • Section 1: Introduction and Required Software Tools
  • Chapter 1: A Machine Learning Refresher -- Technical requirements -- Learning from data -- Popular machine learning datasets -- Diabetes -- Breast cancer -- Handwritten digits -- Supervised and unsupervised learning -- Supervised learning -- Unsupervised learning -- Dimensionality reduction -- Performance measures -- Cost functions -- Mean absolute error -- Mean squared error -- Cross entropy loss -- Metrics -- Classification accuracy -- Confusion matrix -- Sensitivity, specificity, and area under the curve -- Precision, recall, and the F1 score -- Evaluating models -- Machine learning algorithms -- Python packages -- Supervised learning algorithms -- Regression -- Support vector machines -- Neural networks -- Decision trees -- K-Nearest Neighbors -- K-means -- Summary
  • Chapter 2: Getting Started with Ensemble Learning -- Technical requirements -- Bias, variance, and the trade-off -- What is bias? -- What is variance? -- Trade-off -- Ensemble learning -- Motivation -- Identifying bias and variance -- Validation curves -- Learning curves -- Ensemble methods -- Difficulties in ensemble learning -- Weak or noisy data -- Understanding interpretability -- Computational cost -- Choosing the right models -- Summary
  • Section 2: Non-Generative Methods
  • Chapter 3: Voting -- Technical requirements -- Hard and soft voting -- Hard voting -- Soft voting -- Python implementation -- Custom hard voting implementation -- Analyzing our results using Python -- Using scikit-learn -- Hard voting implementation -- Soft voting implementation -- Analyzing our results -- Summary
  • Chapter 4: Stacking -- Technical requirements -- Meta-learning -- Stacking -- Creating metadata -- Deciding on an ensemble's composition -- Selecting base learners -- Selecting the meta-learner -- Python implementation -- Stacking for regression -- Stacking for classification -- Creating a stacking regressor class for scikit-learn -- Summary
  • Section 3: Generative Methods
  • Chapter 5: Bagging -- Technical requirements -- Bootstrapping -- Creating bootstrap samples -- Bagging -- Creating base learners -- Strengths and weaknesses -- Python implementation -- Implementation -- Parallelizing the implementation -- Using scikit-learn -- Bagging for classification -- Bagging for regression -- Summary
  • Chapter 6: Boosting -- Technical requirements -- AdaBoost -- Weighted sampling -- Creating the ensemble -- Implementing AdaBoost in Python -- Strengths and weaknesses -- Gradient boosting -- Creating the ensemble -- Further reading -- Implementing gradient boosting in Python -- Using scikit-learn -- Using AdaBoost -- Using gradient boosting -- XGBoost -- Using XGBoost for regression -- Using XGBoost for classification -- Other boosting libraries -- Summary
  • Chapter 7: Random Forests -- Technical requirements -- Understanding random forest trees -- Building trees -- Illustrative example -- Extra trees -- Creating forests -- Analyzing forests -- Strengths and weaknesses -- Using scikit-learn -- Random forests for classification -- Random forests for regression -- Extra trees for classification -- Extra trees regression -- Summary
  • Section 4: Clustering
  • Chapter 8: Clustering -- Technical requirements -- Consensus clustering -- Hierarchical clustering -- K-means clustering -- Strengths and weaknesses -- Using scikit-learn -- Using voting -- Using OpenEnsembles -- Using graph closure and co-occurrence linkage -- Graph closure -- Co-occurrence matrix linkage -- Summary
  • Section 5: Real World Applications
  • Chapter 9: Classifying Fraudulent Transactions -- Technical requirements -- Getting familiar with the dataset -- Exploratory analysis -- Evaluation methods -- Voting -- Testing the base learners -- Optimizing the decision tree -- Creating the ensemble -- Stacking -- Bagging -- Boosting -- XGBoost -- Using random forests -- Comparative analysis of ensembles -- Summary
  • Chapter 10: Predicting Bitcoin Prices -- Technical requirements -- Time series data -- Bitcoin data analysis -- Establishing a baseline -- The simulator -- Voting -- Improving voting -- Stacking -- Improving stacking -- Bagging -- Improving bagging -- Boosting -- Improving boosting -- Random forests -- Improving random forest -- Summary
  • Chapter 11: Evaluating Sentiment on Twitter -- Technical requirements -- Sentiment analysis tools -- Stemming -- Getting Twitter data -- Creating a model -- Classifying tweets in real time -- Summary
  • Chapter 12: Recommending Movies with Keras -- Technical requirements -- Demystifying recommendation systems -- Neural recommendation systems -- Using Keras for movie recommendations -- Creating the dot model -- Creating the dense model -- Creating a stacking ensemble -- Summary
  • Chapter 13: Clustering World Happiness -- Technical requirements -- Understanding the World Happiness Report -- Creating the ensemble -- Gaining insights -- Summary
  • Another Book You May Enjoy -- Index