Python feature engineering cookbook

Feature engineering, the process of transforming variables and creating features, albeit time-consuming, ensures that your machine learning models perform seamlessly. This second edition of Python Feature Engineering Cookbook will take the struggle out of feature engineering by showing you how to us...

Bibliographic Details
Main Author: Galli, Soledad, (Author)
Format: eBook
Language: English
Published: Birmingham, UK : Packt Publishing Ltd., 2022.
Edition: Second edition.
Subjects: Python (Computer program language); Application software -- Development; Machine learning
ISBN: 9781804611302
Physical Description: 1 online resource (386 pages) : illustrations

LEADER 10063cam a22003977i 4500
001 kn-on1350412247
003 OCoLC
005 20240717213016.0
006 m o d
007 cr cn|||||||||
008 221108s2022 enka o 001 0 eng d
040 |a ORMDA  |b eng  |e rda  |e pn  |c ORMDA  |d OCLCF  |d OCLCO  |d OCLCL  |d DXU 
020 |z 9781804611302 
035 |a (OCoLC)1350412247 
100 1 |a Galli, Soledad,  |e author. 
245 1 0 |a Python feature engineering cookbook /  |c Soledad Galli. 
250 |a Second edition. 
264 1 |a Birmingham, UK :  |b Packt Publishing Ltd.,  |c 2022. 
300 |a 1 online resource (386 pages) :  |b illustrations 
336 |a text  |b txt  |2 rdacontent 
337 |a computer  |b c  |2 rdamedia 
338 |a online resource  |b cr  |2 rdacarrier 
500 |a Includes index. 
506 |a Full text is available only from the IP addresses of computers at Tomas Bata University in Zlín (Univerzita Tomáše Bati ve Zlíně) or via remote access for staff and students 
520 |a Feature engineering, the process of transforming variables and creating features, albeit time-consuming, ensures that your machine learning models perform seamlessly. This second edition of Python Feature Engineering Cookbook will take the struggle out of feature engineering by showing you how to use open source Python libraries to accelerate the process via a plethora of practical, hands-on recipes. This updated edition begins by addressing fundamental data challenges such as missing data and categorical values, before moving on to strategies for dealing with skewed distributions and outliers. The concluding chapters show you how to develop new features from various types of data, including text, time series, and relational databases. With the help of numerous open source Python libraries, you'll learn how to implement each feature engineering method in a performant, reproducible, and elegant manner. By the end of this Python book, you will have the tools and expertise needed to confidently build end-to-end and reproducible feature engineering pipelines that can be deployed into production. 
505 0 |a Cover -- Title Page -- Copyright and Credits -- Contributors -- Table of Contents -- Preface -- Chapter 1: Imputing Missing Data -- Technical requirements -- Removing observations with missing data -- How to do it... -- How it works... -- Performing mean or median imputation -- How to do it... -- How it works... -- Imputing categorical variables -- How to do it... -- How it works... -- Replacing missing values with an arbitrary number -- How to do it... -- How it works... -- Finding extreme values for imputation -- How to do it... -- How it works... -- Marking imputed values -- How to do it... -- How it works... -- Performing multivariate imputation by chained equations -- How to do it... -- How it works... -- See also -- Estimating missing data with nearest neighbors -- How to do it... -- How it works... -- Chapter 2: Encoding Categorical Variables -- Technical requirements -- Creating binary variables through one-hot encoding -- How to do it... -- How it works... -- There's more... -- Performing one-hot encoding of frequent categories -- How to do it... -- How it works... -- There's more... -- Replacing categories with counts or the frequency of observations -- How to do it... -- How it works... -- Replacing categories with ordinal numbers -- How to do it... -- How it works... -- There's more... -- Performing ordinal encoding based on the target value -- How to do it... -- How it works... -- See also -- Implementing target mean encoding -- How to do it... -- How it works... -- There's more... -- Encoding with the Weight of Evidence -- How to do it... -- How it works... -- See also -- Grouping rare or infrequent categories -- How to do it... -- How it works... -- Performing binary encoding -- How to do it... -- How it works... -- See also -- Chapter 3: Transforming Numerical Variables -- Transforming variables with the logarithm function. 
505 8 |a Getting ready -- How to do it... -- How it works... -- There's more... -- Transforming variables with the reciprocal function -- How to do it... -- How it works... -- Using the square root to transform variables -- How to do it... -- How it works... -- Using power transformations -- How to do it... -- How it works... -- Performing Box-Cox transformation -- How to do it... -- How it works... -- There's more... -- Performing Yeo-Johnson transformation -- How to do it... -- How it works... -- There's more... -- Chapter 4: Performing Variable Discretization -- Technical requirements -- Performing equal-width discretization -- How to do it... -- How it works... -- See also -- Implementing equal-frequency discretization -- How to do it... -- How it works... -- Discretizing the variable into arbitrary intervals -- How to do it... -- How it works... -- Performing discretization with k-means clustering -- How to do it... -- How it works... -- See also -- Implementing feature binarization -- Getting ready -- How to do it... -- How it works... -- Using decision trees for discretization -- How to do it... -- How it works... -- There's more... -- Chapter 5: Working with Outliers -- Technical requirements -- Visualizing outliers with boxplots -- How to do it... -- How it works... -- Finding outliers using the mean and standard deviation -- How to do it... -- How it works... -- Finding outliers with the interquartile range proximity rule -- How to do it... -- How it works... -- Removing outliers -- How to do it... -- How it works... -- Capping or censoring outliers -- How to do it... -- How it works... -- There's more... -- Capping outliers using quantiles -- How to do it... -- How it works... -- Chapter 6: Extracting Features from Date and Time Variables -- Technical requirements -- Extracting features from dates with pandas -- Getting ready -- How to do it... -- How it works. 
505 8 |a There's more... -- See also -- Extracting features from time with pandas -- Getting ready -- How to do it... -- How it works... -- There's more... -- Capturing the elapsed time between datetime variables -- How to do it... -- How it works... -- See also -- Working with time in different time zones -- How to do it... -- How it works... -- See also -- Automating feature extraction with Feature-engine -- How to do it... -- How it works... -- Chapter 7: Performing Feature Scaling -- Technical requirements -- Standardizing the features -- How to do it... -- How it works... -- Scaling to the maximum and minimum values -- How to do it... -- How it works... -- Scaling with the median and quantiles -- How to do it... -- How it works... -- Performing mean normalization -- How to do it... -- How it works... -- There's more... -- Implementing maximum absolute scaling -- Getting ready -- How to do it... -- How it works... -- There's more... -- Scaling to vector unit length -- How to do it... -- How it works... -- Chapter 8: Creating New Features -- Technical requirements -- Combining features with mathematical functions -- Getting ready -- How to do it... -- How it works... -- See also -- Comparing features to reference variables -- How to do it... -- How it works... -- See also -- Performing polynomial expansion -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Combining features with decision trees -- Getting ready -- How to do it... -- How it works... -- Creating periodic features from cyclical variables -- Getting ready -- How to do it... -- How it works... -- See also -- Creating spline features -- Getting ready -- How to do it... -- How it works... -- See also -- Chapter 9: Extracting Features from Relational Data with Featuretools -- Technical requirements -- Setting up an entity set and creating features automatically. 
505 8 |a Getting ready -- How to do it... -- How it works... -- See also -- Creating features with general and cumulative operations -- Getting ready -- How to do it... -- How it works... -- Combining numerical features -- How to do it... -- How it works... -- Extracting features from date and time -- How to do it... -- How it works... -- There's more... -- Extracting features from text -- Getting ready -- How to do it... -- How it works... -- Creating features with aggregation primitives -- Getting ready -- How to do it... -- How it works... -- Chapter 10: Creating Features from a Time Series with tsfresh -- Technical requirements -- Extracting features automatically from a time series -- Getting ready -- How to do it... -- How it works... -- See also -- Creating and selecting features for a time series -- How to do it... -- How it works... -- See also -- Tailoring feature creation to different time series -- How to do it... -- How it works... -- Creating pre-selected features -- How to do it... -- How it works... -- Embedding feature creation in a scikit-learn pipeline -- How to do it... -- How it works... -- See also -- Chapter 11: Extracting Features from Text Variables -- Technical requirements -- Counting characters, words, and vocabulary -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Estimating text complexity by counting sentences -- Getting ready -- How to do it... -- How it works... -- There's more... -- Creating features with bag-of-words and n-grams -- Getting ready -- How to do it... -- How it works... -- See also -- Implementing term frequency-inverse document frequency -- Getting ready -- How to do it... -- How it works... -- See also -- Cleaning and stemming text variables -- Getting ready -- How to do it... -- How it works... -- Index -- About Packt -- Other Books You May Enjoy. 
590 |a Knovel  |b Knovel (All titles) 
650 0 |a Python (Computer program language) 
650 0 |a Application software  |x Development. 
650 0 |a Machine learning. 
655 7 |a elektronické knihy  |7 fd186907  |2 czenas 
655 9 |a electronic books  |2 eczenas 
856 4 0 |u https://proxy.k.utb.cz/login?url=https://app.knovel.com/hotlink/toc/id:kpPFEC0004/python-feature-engineering?kpromoter=marc  |y Full text