A Selection Algorithm for Focused Crawlers Incorporating Semantic Metadata

Bibliographic Details
Published in: Distributed Computing and Internet Technology, Vol. 7753, pp. 561-572
Main Authors: Wadwekar, Saurabh; Mukhopadhyay, Debajyoti
Format: Book Chapter
Language: English
Published: Germany, Springer Berlin / Heidelberg, 2013
Springer Berlin Heidelberg
Series: Lecture Notes in Computer Science
ISBN: 9783642360701, 364236070X
ISSN: 0302-9743, 1611-3349
DOI: 10.1007/978-3-642-36071-8_45

Summary: The search results currently offered by the majority of search portals are horizontal by nature. That is, these search engines aim to index as many web pages as possible and present search results based on those pages, so the results tend to be generalized. Focused crawlers were built to download only web pages relevant to a pre-specified topic. Searching over such pages is called vertical search, as it attempts to drill down on a single topic rather than exploring a plethora of other web pages that are related to the search query in one way or another. In this paper, we propose an algorithm that helps a focused crawler decide whether or not a web page should be downloaded. The proposed selection algorithm uses semantic properties of the content to arrive at a decision.