Python machine learning by example : build intelligent systems using Python, TensorFlow 2, PyTorch, and scikit-learn
Equipped with the latest updates, this third edition of Python Machine Learning By Example provides a comprehensive course for ML enthusiasts to strengthen their command of ML concepts, techniques, and algorithms.
| Main Author | |
|---|---|
| Format | eBook Book |
| Language | English |
| Published | Birmingham: Packt Publishing, Limited, 2020 |
| Edition | 3 |
| Subjects | |
| ISBN | 1800209711 9781800209718 |
| DOI | 10.0000/9781800203860 |
Table of Contents:
- Cover -- Copyright -- Packt Page -- Contributors -- Table of Contents -- Preface -- Chapter 1: Getting Started with Machine Learning and Python -- An introduction to machine learning -- Understanding why we need machine learning -- Differentiating between machine learning and automation -- Machine learning applications -- Knowing the prerequisites -- Getting started with three types of machine learning -- A brief history of the development of machine learning algorithms -- Digging into the core of machine learning -- Generalizing with data -- Overfitting, underfitting, and the bias-variance trade-off -- Overfitting -- Underfitting -- The bias-variance trade-off -- Avoiding overfitting with cross-validation -- Avoiding overfitting with regularization -- Avoiding overfitting with feature selection and dimensionality reduction -- Data preprocessing and feature engineering -- Preprocessing and exploration -- Dealing with missing values -- Label encoding -- One-hot encoding -- Scaling -- Feature engineering -- Polynomial transformation -- Power transforms -- Binning -- Combining models -- Voting and averaging -- Bagging -- Boosting -- Stacking -- Installing software and setting up -- Setting up Python and environments -- Installing the main Python packages -- NumPy -- SciPy -- Pandas -- Scikit-learn -- TensorFlow -- Introducing TensorFlow 2 -- Summary -- Exercises -- Chapter 2: Building a Movie Recommendation Engine with Naïve Bayes -- Getting started with classification -- Binary classification -- Multiclass classification -- Multi-label classification -- Exploring Naïve Bayes -- Learning Bayes' theorem by example -- The mechanics of Naïve Bayes -- Implementing Naïve Bayes -- Implementing Naïve Bayes from scratch -- Implementing Naïve Bayes with scikit-learn -- Building a movie recommender with Naïve Bayes -- Evaluating classification performance
- Tuning models with cross-validation -- Summary -- Exercise -- References -- Chapter 3: Recognizing Faces with Support Vector Machine -- Finding the separating boundary with SVM -- Scenario 1 - identifying a separating hyperplane -- Scenario 2 - determining the optimal hyperplane -- Scenario 3 - handling outliers -- Implementing SVM -- Scenario 4 - dealing with more than two classes -- Scenario 5 - solving linearly non-separable problems with kernels -- Choosing between linear and RBF kernels -- Classifying face images with SVM -- Exploring the face image dataset -- Building an SVM-based image classifier -- Boosting image classification performance with PCA -- Fetal state classification on cardiotocography -- Summary -- Exercises -- Chapter 4: Predicting Online Ad Click-Through with Tree-Based Algorithms -- A brief overview of ad click-through prediction -- Getting started with two types of data - numerical and categorical -- Exploring a decision tree from the root to the leaves -- Constructing a decision tree -- The metrics for measuring a split -- Gini Impurity -- Information Gain -- Implementing a decision tree from scratch -- Implementing a decision tree with scikit-learn -- Predicting ad click-through with a decision tree -- Ensembling decision trees - random forest -- Ensembling decision trees - gradient boosted trees -- Summary -- Exercises -- Chapter 5: Predicting Online Ads Click-Through with Logistic Regression -- Converting categorical features to numerical - one-hot encoding and ordinal encoding -- Classifying data with logistic regression -- Getting started with the logistic function -- Jumping from the logistic function to logistic regression -- Training a logistic regression model -- Training a logistic regression model using gradient descent -- Predicting ad click-through with logistic regression using gradient descent
- Training a logistic regression model using stochastic gradient descent -- Training a logistic regression model with regularization -- Feature selection using L1 regularization -- Training on large datasets with online learning -- Handling multiclass classification -- Implementing logistic regression using TensorFlow -- Feature selection using random forest -- Summary -- Exercises -- Chapter 6: Scaling Up Prediction to Terabyte Click Logs -- Learning the essentials of Apache Spark -- Breaking down Spark -- Installing Spark -- Launching and deploying Spark programs -- Programming in PySpark -- Learning on massive click logs with Spark -- Loading click logs -- Splitting and caching the data -- One-hot encoding categorical features -- Training and testing a logistic regression model -- Feature engineering on categorical variables with Spark -- Hashing categorical features -- Combining multiple variables - feature interaction -- Summary -- Exercises -- Chapter 7: Predicting Stock Prices with Regression Algorithms -- A brief overview of the stock market and stock prices -- What is regression? -- Mining stock price data -- Getting started with feature engineering -- Acquiring data and generating features -- Estimating with linear regression -- How does linear regression work? -- Implementing linear regression from scratch -- Implementing linear regression with scikit-learn -- Implementing linear regression with TensorFlow -- Estimating with decision tree regression -- Transitioning from classification trees to regression trees -- Implementing decision tree regression -- Implementing a regression forest -- Estimating with support vector regression -- Implementing SVR -- Evaluating regression performance -- Predicting stock prices with the three regression algorithms -- Summary -- Exercises -- Chapter 8: Predicting Stock Prices with Artificial Neural Networks
- Demystifying neural networks -- Starting with a single-layer neural network -- Layers in neural networks -- Activation functions -- Backpropagation -- Adding more layers to a neural network: DL -- Building neural networks -- Implementing neural networks from scratch -- Implementing neural networks with scikit-learn -- Implementing neural networks with TensorFlow -- Picking the right activation functions -- Preventing overfitting in neural networks -- Dropout -- Early stopping -- Predicting stock prices with neural networks -- Training a simple neural network -- Fine-tuning the neural network -- Summary -- Exercise -- Chapter 9: Mining the 20 Newsgroups Dataset with Text Analysis Techniques -- How computers understand language - NLP -- What is NLP? -- The history of NLP -- NLP applications -- Touring popular NLP libraries and picking up NLP basics -- Installing famous NLP libraries -- Corpora -- Tokenization -- PoS tagging -- NER -- Stemming and lemmatization -- Semantics and topic modeling -- Getting the newsgroups data -- Exploring the newsgroups data -- Thinking about features for text data -- Counting the occurrence of each word token -- Text preprocessing -- Dropping stop words -- Reducing inflectional and derivational forms of words -- Visualizing the newsgroups data with t-SNE -- What is dimensionality reduction? -- t-SNE for dimensionality reduction -- Summary -- Exercises -- Chapter 10: Discovering Underlying Topics in the Newsgroups Dataset with Clustering and Topic Modeling -- Learning without guidance - unsupervised learning -- Clustering newsgroups data using k-means -- How does k-means clustering work? -- Implementing k-means from scratch -- Implementing k-means with scikit-learn -- Choosing the value of k -- Clustering newsgroups data using k-means -- Discovering underlying topics in newsgroups -- Topic modeling using NMF
- Topic modeling using LDA -- Summary -- Exercises -- Chapter 11: Machine Learning Best Practices -- Machine learning solution workflow -- Best practices in the data preparation stage -- Best practice 1 - Completely understanding the project goal -- Best practice 2 - Collecting all fields that are relevant -- Best practice 3 - Maintaining the consistency of field values -- Best practice 4 - Dealing with missing data -- Best practice 5 - Storing large-scale data -- Best practices in the training sets generation stage -- Best practice 6 - Identifying categorical features with numerical values -- Best practice 7 - Deciding whether to encode categorical features -- Best practice 8 - Deciding whether to select features, and if so, how to do so -- Best practice 9 - Deciding whether to reduce dimensionality, and if so, how to do so -- Best practice 10 - Deciding whether to rescale features -- Best practice 11 - Performing feature engineering with domain expertise -- Best practice 12 - Performing feature engineering without domain expertise -- Binarization -- Discretization -- Interaction -- Polynomial transformation -- Best practice 13 - Documenting how each feature is generated -- Best practice 14 - Extracting features from text data -- Tf and tf-idf -- Word embedding -- Word embedding with pre-trained models -- Best practices in the model training, evaluation, and selection stage -- Best practice 15 - Choosing the right algorithm(s) to start with -- Naïve Bayes -- Logistic regression -- SVM -- Random forest (or decision tree) -- Neural networks -- Best practice 16 - Reducing overfitting -- Best practice 17 - Diagnosing overfitting and underfitting -- Best practice 18 - Modeling on large-scale datasets -- Best practices in the deployment and monitoring stage -- Best practice 19 - Saving, loading, and reusing models -- Saving and restoring models using pickle
- Saving and restoring models in TensorFlow