Machine learning with R : learn how to use R to apply powerful machine learning methods and gain an insight into real-world applications

Written as a tutorial to explore and understand the power of R for machine learning, this practical guide covers all of the need-to-know topics in a very systematic way. For each machine learning approach, each step in the process is detailed, from preparing the data for analysis to evaluating...

Bibliographic Details
Main Author: Lantz, Brett (Author)
Format: Electronic eBook
Language: English
Published: Birmingham, UK : Packt Publishing, 2013.
Series: Community experience distilled.
Subjects
Online Access: Full text
ISBN: 9781782162155
1782162151
9781461949657
1461949653
1306070333
9781306070331
9781680153583
1680153587
1782162143
9781782162148
Physical Description: 1 online resource (vii, 375 pages) : illustrations

Table of Contents:
  • Cover; Copyright; Credits; About the Author; About the Reviewers; www.PacktPub.com; Table of Contents; Preface
  • Chapter 1: Introducing Machine Learning; The origins of machine learning; Uses and abuses of machine learning; Ethical considerations; How do machines learn?; Abstraction and knowledge representation; Generalization; Assessing the success of learning; Steps to apply machine learning to your data; Choosing a machine learning algorithm; Thinking about the input data; Thinking about types of machine learning algorithms; Matching your data to an appropriate algorithm; Using R for machine learning; Installing and loading R packages; Installing an R package; Installing a package using the point-and-click interface; Loading an R package; Summary
  • Chapter 2: Managing and Understanding Data; R data structures; Vectors; Factors; Lists; Data frames; Matrixes and arrays; Managing data with R; Saving and loading R data structures; Importing and saving data from CSV files; Importing data from SQL databases; Exploring and understanding data; Exploring the structure of data; Exploring numeric variables; Measuring the central tendency - mean and median; Measuring spread - quartiles and the five-number summary; Visualizing numeric variables - boxplots; Visualizing numeric variables - histograms; Understanding numeric data - uniform and normal distributions; Measuring spread - variance and standard deviation; Exploring categorical variables; Measuring the central tendency - the mode; Exploring relationships between variables; Visualizing relationships - scatterplots; Examining relationships - two-way cross-tabulations; Summary
  • Chapter 3: Lazy Learning - Classification using Nearest Neighbors; Understanding classification using nearest neighbors; The kNN algorithm; Calculating distance; Choosing an appropriate k; Preparing data for use with kNN; Why is the kNN algorithm lazy?; Diagnosing breast cancer with the kNN algorithm; Step 1 - collecting data; Step 2 - exploring and preparing the data; Transformation - normalizing numeric data; Data preparation - creating training and test datasets; Step 3 - training a model on the data; Step 4 - evaluating model performance; Step 5 - improving model performance; Transformation - z-score standardization; Testing alternative values of k; Summary
  • Chapter 4: Probabilistic Learning - Classification using Naive Bayes; Understanding naive Bayes; Basic concepts of Bayesian methods; Probability; Joint probability; Conditional probability with Bayes' theorem; The naive Bayes algorithm; The naive Bayes classification; The Laplace estimator; Using numeric features with naive Bayes; Example - filtering mobile phone spam with the naive Bayes algorithm; Step 1 - collecting data; Step 2 - exploring and preparing the data; Data preparation - processing text data for analysis; Data preparation - creating training and test datasets; Visualizing text data - word clouds.