Practical web scraping for data science : best practices and examples with Python
Including many larger, fully worked-out examples, this book provides a complete and modern guide to web scraping, using Python as the programming language, without glossing over important details or best practices.
Main Authors: | Broucke, Seppe vanden; Baesens, Bart |
---|---|
Format: | eBook |
Language: | English |
Published: | [Berkeley, California] : Apress, [2018] |
Subjects: | Python (Computer program language); Data mining; Automatic data collection systems |
ISBN: | 9781484235829 1484235827 9781484235812 |
Physical Description: | 1 online resource (xvi, 306 pages) : illustrations (some color) |
LEADER | 04412cam a2200421 i 4500 | ||
---|---|---|---|
001 | kn-on1032303167 | ||
003 | OCoLC | ||
005 | 20240717213016.0 | ||
006 | m o d | ||
007 | cr cn||||||||| | ||
008 | 180425s2018 caua o 001 0 eng d | ||
040 | |a N$T |b eng |e rda |e pn |c N$T |d N$T |d GW5XE |d EBLCP |d COO |d UAB |d OCLCF |d OCLCQ |d YDX |d U3W |d SNK |d G3B |d LVT |d C6I |d UKMGB |d K6U |d UMR |d CAUOI |d D6H |d MERER |d OCLCQ |d UKAHL |d OCLCQ |d OCLCO |d OCLCQ |d BRF |d OCLCQ |d TEFOD |d COM |d OCLCO |d OCL |d FTB |d OCLCQ |d LVB |d OCLCO |d OCLCL | ||
020 | |a 9781484235829 |q electronic book | ||
020 | |a 1484235827 |q electronic book | ||
020 | |z 9781484235812 | ||
035 | |a (OCoLC)1032303167 | ||
100 | 1 | |a Broucke, Seppe vanden, |d 1986- |e author. |1 https://id.oclc.org/worldcat/entity/E39PCjCj6vmfDkxTtjbtttFYRq | |
245 | 1 | 0 | |a Practical web scraping for data science : |b best practices and examples with Python / |c Seppe vanden Broucke, Bart Baesens. |
264 | 1 | |a [Berkeley, California] : |b Apress, |c [2018] | |
300 | |a 1 online resource (xvi, 306 pages) : |b illustrations (some color) | ||
336 | |a text |b txt |2 rdacontent | ||
337 | |a computer |b c |2 rdamedia | ||
338 | |a online resource |b cr |2 rdacarrier | ||
500 | |a Includes index. | ||
505 | 8 | |a Intro; Table of Contents; About the Authors; About the Technical Reviewer; Introduction; Part I: Web Scraping Basics; Chapter 1: Introduction; 1.1 What Is Web Scraping?; 1.1.1 Why Web Scraping for Data Science?; 1.1.2 Who Is Using Web Scraping?; 1.2 Getting Ready; 1.2.1 Setting Up; 1.2.2 A Quick Python Primer; Chapter 2: The Web Speaks HTTP; 2.1 The Magic of Networking; 2.2 The HyperText Transfer Protocol: HTTP; 2.3 HTTP in Python: The Requests Library; 2.4 Query Strings: URLs with Parameters; Chapter 3: Stirring the HTML and CSS Soup; 3.1 Hypertext Markup Language: HTML | |
505 | 8 | |a 3.2 Using Your Browser as a Development Tool; 3.3 Cascading Style Sheets: CSS; 3.4 The Beautiful Soup Library; 3.5 More on Beautiful Soup; Part II: Advanced Web Scraping; Chapter 4: Delving Deeper in HTTP; 4.1 Working with Forms and POST Requests; 4.2 Other HTTP Request Methods; 4.3 More on Headers; 4.4 Dealing with Cookies; 4.5 Using Sessions with Requests; 4.6 Binary, JSON, and Other Forms of Content; Chapter 5: Dealing with JavaScript; 5.1 What Is JavaScript?; 5.2 Scraping JavaScript; 5.3 Scraping with Selenium; 5.4 More on Selenium; Chapter 6: From Web Scraping to Web Crawling | |
505 | 8 | |a 6.1 What Is Web Crawling?; 6.2 Web Crawling in Python; 6.3 Storing Results in a Database; Part III: Managerial Concerns and Best Practices; Chapter 7: Managerial and Legal Concerns; 7.1 The Data Science Process; 7.2 Where Does Web Scraping Fit In?; 7.3 Legal Concerns; Chapter 8: Closing Topics; 8.1 Other Tools; 8.1.1 Alternative Python Libraries; 8.1.2 Scrapy; 8.1.3 Caching; 8.1.4 Proxy Servers; 8.1.5 Scraping in Other Programming Languages; 8.1.6 Command-Line Tools; 8.1.7 Graphical Scraping Tools; 8.2 Best Practices and Tips; Chapter 9: Examples; 9.1 Scraping Hacker News | |
505 | 8 | |a 9.2 Using the Hacker News API; 9.3 Quotes to Scrape; 9.4 Books to Scrape; 9.5 Scraping GitHub Stars; 9.6 Scraping Mortgage Rates; 9.7 Scraping and Visualizing IMDB Ratings; 9.8 Scraping IATA Airline Information; 9.9 Scraping and Analyzing Web Forum Interactions; 9.10 Collecting and Clustering a Fashion Data Set; 9.11 Sentiment Analysis of Scraped Amazon Reviews; 9.12 Scraping and Analyzing News Articles; 9.13 Scraping and Analyzing a Wikipedia Graph; 9.14 Scraping and Visualizing a Board Members Graph; 9.15 Breaking CAPTCHA's Using Deep Learning; Index | |
506 | |a Full text is available only from IP addresses of computers at Tomas Bata University in Zlín or via remote access for employees and students | ||
520 | |a Including many larger, fully worked-out examples, this book provides a complete and modern guide to web scraping, using Python as the programming language, without glossing over important details or best practices. -- |c Edited summary from book. | ||
590 | |a Knovel |b Knovel (All titles) | ||
650 | 0 | |a Python (Computer program language) | |
650 | 0 | |a Data mining. | |
650 | 0 | |a Automatic data collection systems. | |
655 | 7 | |a elektronické knihy |7 fd186907 |2 czenas | |
655 | 9 | |a electronic books |2 eczenas | |
700 | 1 | |a Baesens, Bart, |e author. | |
856 | 4 | 0 | |u https://proxy.k.utb.cz/login?url=https://app.knovel.com/hotlink/toc/id:kpPWSDSBP2/practical-web-scraping?kpromoter=marc |y Full text |