Natural Language Processing Recipes - Unlocking Text Data with Machine Learning and Deep Learning Using Python (2nd Edition)
Focus on implementing end-to-end projects using Python and leverage state-of-the-art algorithms. This book teaches you to efficiently use a wide range of natural language processing (NLP) packages to: implement text classification, identify parts of speech, utilize topic modeling, text summarization...
| Main Author | |
|---|---|
| Format | eBook |
| Language | English |
| Published | Berkeley, CA: Apress, an imprint of Springer Nature, 2021; Apress L. P. |
| Edition | 2 |
| Subjects | |
| Online Access | Get full text |
| ISBN | 1484273508 9781484273500 9781484273517 1484273516 |
| DOI | 10.1007/978-1-4842-7351-7 |
Table of Contents:
- Title Page -- Introduction -- Table of Contents -- 1. Extracting the Data -- 2. Exploring and Processing Text Data -- 3. Converting Text to Features -- 4. Advanced Natural Language Processing -- 5. Implementing Industry Applications -- 6. Deep Learning for NLP -- 7. Conclusion and Next-Gen NLP -- Index
- Intro -- Table of Contents -- About the Authors -- About the Technical Reviewer -- Acknowledgments -- Introduction -- Chapter 1: Extracting the Data -- Introduction -- Client Data -- Free Sources -- Web Scraping -- Recipe 1-1. Collecting Data -- Problem -- Solution -- How It Works -- Step 1-1. Log in to the Twitter developer portal -- Step 1-2. Execute query in Python -- Recipe 1-2. Collecting Data from PDFs -- Problem -- Solution -- How It Works -- Step 2-1. Install and import all the necessary libraries -- Step 2-2. Extract text from a PDF file -- Recipe 1-3. Collecting Data from Word Files -- Problem -- Solution -- How It Works -- Step 3-1. Install and import all the necessary libraries -- Step 3-2. Extract text from a Word file -- Recipe 1-4. Collecting Data from JSON -- Problem -- Solution -- How It Works -- Step 4-1. Install and import all the necessary libraries -- Step 4-2. Extract text from a JSON file -- Recipe 1-5. Collecting Data from HTML -- Problem -- Solution -- How It Works -- Step 5-1. Install and import all the necessary libraries -- Step 5-2. Fetch the HTML file -- Step 5-3. Parse the HTML file -- Step 5-4. Extract a tag value -- Step 5-5. Extract all instances of a particular tag -- Step 5-6. Extract all text from a particular tag -- Recipe 1-6. Parsing Text Using Regular Expressions -- Problem -- Solution -- How It Works -- Tokenizing -- Extracting Email IDs -- Replacing Email IDs -- Extracting Data from an eBook and Performing regex -- Recipe 1-7. Handling Strings -- Problem -- Solution -- How It Works -- Replacing Content -- Concatenating Two Strings -- Searching for a Substring in a String -- Recipe 1-8. Scraping Text from the Web -- Problem -- Solution -- How It Works -- Step 8-1. Install all the necessary libraries -- Step 8-2. Import the libraries -- Step 8-3. Identify the URL to extract the data
- Step 8-4. Request the URL and download the content using Beautiful Soup -- Step 8-5. Understand the website's structure to extract the required information -- Step 8-6. Use Beautiful Soup to extract and parse the data from HTML tags -- Step 8-7. Convert lists to a data frame and perform an analysis that meets business requirements -- Step 8-8. Download the data frame -- Chapter 2: Exploring and Processing Text Data -- Recipe 2-1. Converting Text Data to Lowercase -- Problem -- Solution -- How It Works -- Step 1-1. Read/create the text data -- Step 1-2. Execute the lower() function on the text data -- Recipe 2-2. Removing Punctuation -- Problem -- Solution -- How It Works -- Step 2-1. Read/create the text data -- Step 2-2. Execute the replace() function on the text data -- Recipe 2-3. Removing Stop Words -- Problem -- Solution -- How It Works -- Step 3-1. Read/create the text data -- Step 3-2. Remove punctuation from the text data -- Recipe 2-4. Standardizing Text -- Problem -- Solution -- How It Works -- Step 4-1. Create a custom lookup dictionary -- Step 4-2. Create a custom function for text standardization -- Step 4-3. Run the text_std function -- Recipe 2-5. Correcting Spelling -- Problem -- Solution -- How It Works -- Step 5-1. Read/create the text data -- Step 5-2. Execute spelling correction on the text data -- Recipe 2-6. Tokenizing Text -- Problem -- Solution -- How It Works -- Step 6-1. Read/create the text data -- Step 6-2. Tokenize the text data -- Recipe 2-7. Stemming -- Problem -- Solution -- How It Works -- Step 7-1. Read the text data -- Step 7-2. Stem the text -- Recipe 2-8. Lemmatizing -- Problem -- Solution -- How It Works -- Step 8-1. Read the text data -- Step 8-2. Lemmatize the data -- Recipe 2-9. Exploring Text Data -- Problem -- Solution -- How It Works -- Step 9-1. Read the text data -- Step 9-2. Import necessary libraries
- Step 9-3. Check the number of words in the data -- Step 9-4. Compute the frequency of all words in the reviews -- Step 9-5. Consider words with length greater than 3 and plot -- Step 9-6. Build a word cloud -- Recipe 2-10. Dealing with Emojis and Emoticons -- Problem -- Solution -- How It Works -- Step 10-A1. Read the text data -- Step 10-A2. Install and import necessary libraries -- Step 10-A3. Write a function that converts emojis into words -- Step 10-A4. Pass text with an emoji to the function -- Problem -- Solution -- How It Works -- Step 10-B1. Read the text data -- Step 10-B2. Install and import necessary libraries -- Step 10-B3. Write a function to remove emojis -- Step 10-B4. Pass text with an emoji to the function -- Problem -- Solution -- How It Works -- Step 10-C1. Read the text data -- Step 10-C2. Install and import necessary libraries -- Step 10-C3. Write function to convert emoticons into word -- Step 10-C4. Pass text with emoticons to the function -- Problem -- Solution -- How It Works -- Step 10-D1. Read the text data -- Step 10-D2. Install and import necessary libraries -- Step 10-D3. Write function to remove emoticons -- Step 10-D4. Pass text with emoticons to the function -- Problem -- Solution -- How It Works -- Step 10-E1. Read the text data -- Step 10-E2. Install and import necessary libraries -- Step 10-E3. Find all emojis and determine their meaning -- Recipe 2-11. Building a Text Preprocessing Pipeline -- Problem -- Solution -- How It Works -- Step 11-1. Read/create the text data -- Step 11-2. Process the text -- Chapter 3: Converting Text to Features -- Recipe 3-1. Converting Text to Features Using One-Hot Encoding -- Problem -- Solution -- How It Works -- Step 1-1. Store the text in a variable -- Step 1-2. Execute a function on the text data -- Recipe 3-2. Converting Text to Features Using a Count Vectorizer -- Problem
- Solution -- How It Works -- Recipe 3-3. Generating n-grams -- Problem -- Solution -- How It Works -- Step 3-1. Generate n-grams using TextBlob -- Step 3-2. Generate bigram-based features for a document -- Recipe 3-4. Generating a Co-occurrence Matrix -- Problem -- Solution -- How It Works -- Step 4-1. Import the necessary libraries -- Step 4-2. Create function for a co-occurrence matrix -- Step 4-3. Generate a co-occurrence matrix -- Recipe 3-5. Hash Vectorizing -- Problem -- Solution -- How It Works -- Step 5-1. Import the necessary libraries and create a document -- Step 5-2. Generate a hash vectorizer matrix -- Recipe 3-6. Converting Text to Features Using TF-IDF -- Problem -- Solution -- How It Works -- Step 6-1. Read the text data -- Step 6-2. Create the features -- Recipe 3-7. Implementing Word Embeddings -- Problem -- Solution -- How It Works -- skip-gram -- Continuous Bag of Words (CBOW) -- Recipe 3-8. Implementing fastText -- Problem -- Solution -- How It Works -- Recipe 3-9. Converting Text to Features Using State-of-the-Art Embeddings -- Problem -- Solution -- ELMo -- Sentence Encoders -- doc2vec -- Sentence-BERT -- Universal Encoder -- InferSent -- Open-AI GPT -- How It Works -- Step 9-1. Import a notebook and data to Google Colab -- Step 9-2. Install and import libraries -- Step 9-3. Read text data -- Step 9-4. Process text data -- Step 9-5. Generate a feature vector -- Sentence-BERT -- Universal Encoder -- InferSent -- Open-AI GPT -- Step 9-6. Generate a feature vector function automatically using a selected embedding method -- Chapter 4: Advanced Natural Language Processing -- Recipe 4-1. Extracting Noun Phrases -- Problem -- Solution -- How It Works -- Recipe 4-2. Finding Similarity Between Texts -- Solution -- How It Works -- Step 2-1. Create/read the text data -- Step 2-2. Find similarities -- Phonetic Matching
- Recipe 4-3. Tagging Part of Speech -- Problem -- Solution -- How It Works -- Step 3-1. Store the text in a variable -- Step 3-2. Import NLTK for POS -- Recipe 4-4. Extracting Entities from Text -- Problem -- Solution -- How It Works -- Step 4-1. Read/create the text data -- Step 4-2. Extract the entities -- Using NLTK -- Using spaCy -- Recipe 4-5. Extracting Topics from Text -- Problem -- Solution -- How It Works -- Step 5-1. Create the text data -- Step 5-2. Clean and preprocess the data -- Step 5-3. Prepare the document term matrix -- Step 5-4. Create the LDA model -- Recipe 4-6. Classifying Text -- Problem -- Solution -- How It Works -- Step 6-1. Collect and understand the data -- Step 6-2. Text processing and feature engineering -- Step 6-3. Model training -- Recipe 4-7. Carrying Out Sentiment Analysis -- Problem -- Solution -- How It Works -- Step 7-1. Create the sample data -- Step 7-2. Clean and preprocess the data -- Step 7-3. Get the sentiment scores -- Recipe 4-8. Disambiguating Text -- Problem -- Solution -- How It Works -- Step 8-1. Import libraries -- Step 8-2. Disambiguate word sense -- Recipe 4-9. Converting Speech to Text -- Problem -- Solution -- How It Works -- Step 9-1. Define the business problem -- Step 9-2. Install and import necessary libraries -- Step 9-3. Run the code -- Recipe 4-10. Converting Text to Speech -- Problem -- Solution -- How It Works -- Step 10-1. Install and import necessary libraries -- Step 10-2. Run the code with the gTTS function -- Recipe 4-11. Translating Speech -- Problem -- Solution -- How It Works -- Step 11-1. Install and import necessary libraries -- Step 11-2. Input text -- Step 11-3. Run the goslate function -- Chapter 5: Implementing Industry Applications -- Recipe 5-1. Implementing Multiclass Classification -- Problem -- Solution -- How It Works -- Step 1-1. Get the data from Kaggle
- Step 1-2. Import the libraries