Understand the Problem by Understanding the Data

Bibliographic Details
Published in: Machine Learning in Python, pp. 23-73
Main Author: Bowles, Michael
Format: Book Chapter
Language: English
Published: United States, John Wiley & Sons, Incorporated, 2015
John Wiley & Sons, Inc
ISBN: 1118961749; 9781118961742
DOI: 10.1002/9781119183600.ch2

Summary: This chapter is about opening up your new data set so you can see what is inside, get an appreciation for what you will be able to do with the data, and start thinking about how you will approach model building with it. It has two purposes. One is to familiarize you with data sets that will be used later as examples of different types of problems to be solved using the algorithms. The other purpose is to demonstrate some of the tools available in Python for data exploration. The chapter uses a simple example to review some basic problem structure, nomenclature, and characteristics of a machine learning data set. After establishing some common language, the chapter goes one by one through several different types of function approximation problems. The chapter also introduces several visualization techniques, such as quantile‐quantile (Q‐Q) plots and parallel coordinates plots.