Layout-based computation of web page similarity ranks

Bibliographic Details
Published in International Journal of Human-Computer Studies, Vol. 110, pp. 95-114
Main Authors Bozkir, Ahmet Selman; Akcapinar Sezer, Ebru
Format Journal Article
Language English
Published Elsevier Ltd, 01.02.2018
ISSN 1071-5819
EISSN 1095-9300
DOI 10.1016/j.ijhcs.2017.10.008

Abstract
• A novel approach to measuring and ranking web page layout similarity is proposed.
• The visual layout of a web page can be used as a query item.
• Wireframes are proposed and verified as a representation method.
• Form elements, animations and whitespace are introduced as structural features.
• A novel, verified dataset, SWPS-40, is introduced for further studies of web page layout similarity.
• Experiments show that combining the wireframe representation with structural and vision-based features is a reasonable basis for comparing the visual similarity of web pages.

In this paper, we propose a ranking approach that considers visual similarities among web pages by using structural and vision-based features. Throughout the study, we aim to understand and represent the visual structure of a web page the way people do, by focusing on layout similarity through the wireframe design. The study has two parts. In the first part, structural similarities are analyzed with the proposed concept of “layout components”, along with visual inspection of DOM trees. In this way, five types of structural layout components are proposed. Whitespace is also used, since it is an important cue in the visual perception of web pages. In the second part, a computer-vision method, the histogram of oriented gradients (HOG), is employed to reveal local visual cues in terms of edge orientations. After feature extraction, the extracted feature histograms are mapped onto spatial pyramid matching, a multilevel, multi-resolution bag-of-features representation that preserves spatial information. In this way, three goals were achieved: (1) the visual layout of web pages was mapped and compared in a multi-resolution schema; (2) the intermediate process of visual segmentation was removed; and (3) efficient and easily comparable web page layout signatures were generated.

We also conducted a questionnaire study covering 312 subjects. This helped us create a benchmark dataset of similarity scores collected from individuals. Until now, the literature has offered no corpus oriented to ranking web page layout similarity. Our approach achieved strong ranking performance in the top-5 and top-10 retrieval results. According to the findings of the comparative study, it outperforms some structure-based and vision-based studies in the literature. With this result, a web page can be used as a query item to find other, similar web pages while being treated as a web page rather than as an image or any other kind of object.
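The spatial pyramid matching step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each page has already been reduced to a square grid of quantized visual-word indices (for example, from HOG descriptors), that the grid side is divisible by 2 to the power of `levels`, and it uses the level weights of the standard pyramid match formulation. The function names `pyramid_histograms`, `spm_similarity` and `rank_pages` are hypothetical.

```python
from collections import Counter


def pyramid_histograms(words, vocab_size, levels=2):
    """Build per-cell visual-word histograms for pyramid levels 0..levels.

    `words` is a square 2D list of visual-word indices, one per image patch.
    Assumption: len(words) is divisible by 2 ** levels.
    """
    n = len(words)
    pyramid = []
    for level in range(levels + 1):
        cells = 2 ** level      # cells per side at this level
        step = n // cells       # patches per cell side
        level_hists = []
        for cy in range(cells):
            for cx in range(cells):
                counts = Counter(
                    words[y][x]
                    for y in range(cy * step, (cy + 1) * step)
                    for x in range(cx * step, (cx + 1) * step)
                )
                level_hists.append([counts.get(w, 0) for w in range(vocab_size)])
        pyramid.append(level_hists)
    return pyramid


def spm_similarity(pyr_a, pyr_b):
    """Weighted histogram intersection across pyramid levels.

    Finer levels get larger weights: level 0 is weighted 1 / 2**L and
    level l >= 1 is weighted 1 / 2**(L - l + 1), where L is the top level.
    """
    top = len(pyr_a) - 1
    score = 0.0
    for level, (cells_a, cells_b) in enumerate(zip(pyr_a, pyr_b)):
        inter = sum(
            min(a, b)
            for hist_a, hist_b in zip(cells_a, cells_b)
            for a, b in zip(hist_a, hist_b)
        )
        weight = 1 / 2 ** top if level == 0 else 1 / 2 ** (top - level + 1)
        score += weight * inter
    return score


def rank_pages(query, candidates, vocab_size, levels=2):
    """Return (name, score) pairs sorted from most to least similar."""
    q = pyramid_histograms(query, vocab_size, levels)
    scored = [
        (name, spm_similarity(q, pyramid_histograms(grid, vocab_size, levels)))
        for name, grid in candidates.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Two toy 4x4 "pages" over a 2-word vocabulary: one is the mirror of the other.
page_a = [[0, 0, 1, 1] for _ in range(4)]
page_b = [[1, 1, 0, 0] for _ in range(4)]
ranking = rank_pages(page_a, {"same": page_a, "mirrored": page_b}, vocab_size=2)
```

The two toy pages share the same global histogram, so only the finer pyramid levels tell them apart. This is the property that lets a pyramid signature reflect layout rather than overall content alone.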
Author email selman@cs.hacettepe.edu.tr (Bozkir, Ahmet Selman)
Copyright 2017 Elsevier Ltd
Keywords Bag of features; Histogram of oriented gradients; Web page layout; Similarity ranking; Layout similarity; Spatial pyramid matching
References Bosch, Zisserman, Munoz (bib0005) 2007
Law, Gutierrez, Thome, Gançarski (bib0029) 2012
Kleinberg (bib0026) 1999; 46
Song, Liu, Wen, Ma (bib0046) 2004
OpenCV, (2015). “OpenCV Project”
Lazebnik, Schmid, Ponce (bib0030) 2006
Song (bib0045) 2011; 27
W3C, (2015), “World wide web consortium”
Chatzichristofis, Boutalis (bib0010) 2008
Cao, Mao, Luo (bib0009) 2010; 25
Sivic, Zisserman (bib0044) 2003
Gentner (bib0021) 1983; 7
Bay, H., Tuytelaars, T., Van Gool, L. (2006). SURF: speeded up robust features, ICCV’06.
Zeng, Flaganan, Hirokawa (bib0052) 2013
11.8. 2016
Hahn, Chater, Richardson (bib0022) 2003; 87
ASP.NET, 2015, “ASP.NET platform”
Martine, Rugg (bib0034) 2005; 22
.
Dehmer, Streib, Mehler, Kilian (bib0015) 2006; 1
Pehlivan, Ben Saad, Gançarkski (bib0039) 2010
Fu, Wenyin, Deng (bib0019) 2006; 3
Tombros, Ali (bib0048) 2005
(16.9.2015).
Flesca, Manco, Masciari, Pontieri, Pugliese (bib0018) 2007; 60
Zhang, Lu, Xu (bib0053) 2013; 37
Hara, Yamada, Miyake (bib0023) 2009
Chen, Dick, Miller (bib0012) 2010; 10
Ramon, Cuadrado, Molina, Vanderdonckt (bib0040) 2016; 70
Reinecke, Yeh, Miratrix, Mardiko, Zhao, Liu, Gajos (bib0041) 2013
Bartik (bib0003) 2012
Lindeberg (bib0032) 2013; 107
Liang, Yuang (bib0031) 2015; 28
Dorner (bib0017) 1997
Takama, Mistsuhashi (bib0047) 2005
Alpuente, Romero (bib0001) 2010
(30.8.2015).
Lowe (bib0033) 2004; 60
(17.9.2015).
DMCA, 2016, “Digital millenium copyright act”
O'Hara, S., & Draper, B.-A. (2011). Introduction to the bag of features paradigm for image classification
GeckoFX, 2015, “Mozilla GeckoFx.Net”
Robins, Holmes (bib0042) 2008; 44
Cai, Yu, Wen, Ma (bib0008) 2004
Michailidou, Harper, Bechofer (bib0035) 2008
Rosiello, Kirda, Kruegel, Ferrandi (bib0043) 2007
Kudelka, Takama, Snasel, Klos, Pokorny (bib0028) 2010; 67
Chen, Chen, Huang, Chen (bib0013) 2009
Kang, Choi (bib0025) 2008; 14
Joshi, Liu (bib0024) 2009
van de Sande, Gevers, Snoek (bib0050) 2010; 32
van der Geest, Loorbach (bib0049) 2005; 52
Dalal, Triggs (bib0014) 2005
Bozkir, Sezer (bib0006) 2014
Möller, Brezing, Unz (bib0036) 2012; 31
Cai, D., Yu, S., Wen, J.-R., & Ma, W.-Y. (2003). VIPS: a vision based pages segmentation algorithm, Technical Report MSR-TR-2003-79, Microsoft Research.
Chatzichristofis, Boutalis (bib0011) 2008
Lindeberg (10.1016/j.ijhcs.2017.10.008_bib0032) 2013; 107
Martine (10.1016/j.ijhcs.2017.10.008_bib0034) 2005; 22
Bosch (10.1016/j.ijhcs.2017.10.008_bib0005) 2007
van de Sande (10.1016/j.ijhcs.2017.10.008_bib0050) 2010; 32
Zeng (10.1016/j.ijhcs.2017.10.008_bib0052) 2013
10.1016/j.ijhcs.2017.10.008_bib0004
Lazebnik (10.1016/j.ijhcs.2017.10.008_bib0030) 2006
10.1016/j.ijhcs.2017.10.008_bib0007
Hara (10.1016/j.ijhcs.2017.10.008_bib0023) 2009
Chatzichristofis (10.1016/j.ijhcs.2017.10.008_bib0011) 2008
10.1016/j.ijhcs.2017.10.008_bib0002
Law (10.1016/j.ijhcs.2017.10.008_bib0029) 2012
Takama (10.1016/j.ijhcs.2017.10.008_bib0047) 2005
Kudelka (10.1016/j.ijhcs.2017.10.008_bib0028) 2010; 67
Flesca (10.1016/j.ijhcs.2017.10.008_bib0018) 2007; 60
van der Geest (10.1016/j.ijhcs.2017.10.008_bib0049) 2005; 52
Bozkir (10.1016/j.ijhcs.2017.10.008_bib0006) 2014
Dalal (10.1016/j.ijhcs.2017.10.008_bib0014) 2005
Fu (10.1016/j.ijhcs.2017.10.008_bib0019) 2006; 3
Reinecke (10.1016/j.ijhcs.2017.10.008_bib0041) 2013
Alpuente (10.1016/j.ijhcs.2017.10.008_bib0001) 2010
Cai (10.1016/j.ijhcs.2017.10.008_bib0008) 2004
Hahn (10.1016/j.ijhcs.2017.10.008_bib0022) 2003; 87
Chatzichristofis (10.1016/j.ijhcs.2017.10.008_bib0010) 2008
10.1016/j.ijhcs.2017.10.008_bib0037
Chen (10.1016/j.ijhcs.2017.10.008_bib0012) 2010; 10
10.1016/j.ijhcs.2017.10.008_bib0038
Pehlivan (10.1016/j.ijhcs.2017.10.008_bib0039) 2010
Dehmer (10.1016/j.ijhcs.2017.10.008_bib0015) 2006; 1
Möller (10.1016/j.ijhcs.2017.10.008_bib0036) 2012; 31
Liang (10.1016/j.ijhcs.2017.10.008_bib0031) 2015; 28
Gentner (10.1016/j.ijhcs.2017.10.008_bib0021) 1983; 7
Sivic (10.1016/j.ijhcs.2017.10.008_bib0044) 2003
Dorner (10.1016/j.ijhcs.2017.10.008_bib0017) 1997
Rosiello (10.1016/j.ijhcs.2017.10.008_bib0043) 2007
Kang (10.1016/j.ijhcs.2017.10.008_bib0025) 2008; 14
Ramon (10.1016/j.ijhcs.2017.10.008_bib0040) 2016; 70
Bartik (10.1016/j.ijhcs.2017.10.008_bib0003) 2012
10.1016/j.ijhcs.2017.10.008_bib0020
Robins (10.1016/j.ijhcs.2017.10.008_bib0042) 2008; 44
Song (10.1016/j.ijhcs.2017.10.008_bib0045) 2011; 27
Chen (10.1016/j.ijhcs.2017.10.008_bib0013) 2009
Zhang (10.1016/j.ijhcs.2017.10.008_bib0053) 2013; 37
Kleinberg (10.1016/j.ijhcs.2017.10.008_bib0026) 1999; 46
Joshi (10.1016/j.ijhcs.2017.10.008_bib0024) 2009
Tombros (10.1016/j.ijhcs.2017.10.008_bib0048) 2005
Song (10.1016/j.ijhcs.2017.10.008_bib0046) 2004
10.1016/j.ijhcs.2017.10.008_bib0016
Cao (10.1016/j.ijhcs.2017.10.008_bib0009) 2010; 25
10.1016/j.ijhcs.2017.10.008_bib0051
Michailidou (10.1016/j.ijhcs.2017.10.008_bib0035) 2008
Lowe (10.1016/j.ijhcs.2017.10.008_bib0033) 2004; 60
References_xml – start-page: 1
  year: 2012
  end-page: 6
  ident: bib0029
  article-title: Structural and visual similarity learning for web page archiving
  publication-title: Content-Based Multimedia Indexing (CBMI), 2012 10th International Workshop
– volume: 1
  start-page: 3057
  year: 2006
  end-page: 3063
  ident: bib0015
  article-title: Measuring the structural similarity of web-based documents: a novel approach
  publication-title: Int. J. Comput. Intelligence
– volume: 107
  start-page: 589
  year: 2013
  end-page: 635
  ident: bib0032
  article-title: A computational theory of visual receptive fields
  publication-title: Biolog. Cybern.
– year: 2004
  ident: bib0008
  article-title: Block-based web search
  publication-title: Conference of SIGIR’04
– year: 2008
  ident: bib0011
  article-title: FCTH: fuzzy color and texture histogram - a low level feature for accurate image retrieval
  publication-title: WIAMIS '08
– reference: GeckoFX, 2015, “Mozilla GeckoFx.Net”,
– start-page: 30
  year: 2009
  end-page: 36
  ident: bib0023
  article-title: Visual similarity-based phishing detection without victim site information
  publication-title: Proc. CICS’09 IEEE Symposium
– volume: 52
  start-page: 27
  year: 2005
  end-page: 36
  ident: bib0049
  article-title: Testing the visual consistency of web sites
  publication-title: Tech. Commun.
– start-page: 45
  year: 2010
  end-page: 51
  ident: bib0001
  article-title: A tool for computing visual similarity of web pages
  publication-title: Proc. 10th Annual International Symposium on Applications and the Internet
– start-page: 1470
  year: 2003
  end-page: 1477
  ident: bib0044
  article-title: Video google: a text retrieval approach to object matching in videos
  publication-title: ICCV
– reference: (30.8.2015).
– reference: (17.9.2015).
– start-page: 312
  year: 2008
  end-page: 322
  ident: bib0010
  article-title: CEDD: color and edge directivity descriptor: a compact descriptor for image indexing and retrieval
  publication-title: ICVS 2008
– start-page: 1
  year: 2010
  end-page: 15
  ident: bib0039
  article-title: Vi-DIFF: understanding web pages changes
  publication-title: 21st International Conference, DEXA 2010, Bilbao
– volume: 22
  start-page: 115
  year: 2005
  end-page: 120
  ident: bib0034
  article-title: That site looks 88.46% familiar: quantifying similarity of web page design
  publication-title: Expert Syst.
– volume: 37
  start-page: 231
  year: 2013
  end-page: 244
  ident: bib0053
  article-title: Web phishing detection based on page spatial layout similarity
  publication-title: Informatica
– start-page: 487
  year: 2005
  end-page: 501
  ident: bib0048
  article-title: Factors affecting web page similarity
  publication-title: Proc. ECIR 2005
– volume: 27
  start-page: 793
  year: 2011
  end-page: 816
  ident: bib0045
  article-title: the role of structure and content in perception of visual similarity between web pages
  publication-title: Int. J. Human-Comput. Interaction
– reference: DMCA, 2016, “Digital millenium copyright act”,
– volume: 7
  start-page: 155
  year: 1983
  end-page: 170
  ident: bib0021
  article-title: Structure-mapping: a theoritcal framework for analogy
  publication-title: Cognitive Sci.
– year: 2005
  ident: bib0047
  article-title: Visual similarity comparison for web page retrieval,
  publication-title: Proceedings of the 2005 IEEE/WIC/ACM International Conference on Web Intelligence
– start-page: 2049
  year: 2013
  end-page: 2058
  ident: bib0041
  article-title: Predicting users’ first impressions of website aesthetics with a quantification of perceived visual complexity and colorfulness
  publication-title: Proc. SIGCHI Conf. Human Factors Comput. Syst.
– start-page: 2005
  year: 2005
  ident: bib0014
  article-title: Histogram of oriented gradients for human detection
  publication-title: IEEE Comput. Soc. Conf. Comput. Vision Pattern Recognit.
– volume: 25
  start-page: 93
  year: 2010
  end-page: 104
  ident: bib0009
  article-title: Segmentation method for web page analysis using shrinking and dividing
  publication-title: Int. J. Parallel, Emergent Distrib. Syst.
– year: 2008
  ident: bib0035
  article-title: Visual complexity and aesthetic perception of web pages
  publication-title: SIGDOC’08
– start-page: 65
  year: 2013
  end-page: 69
  ident: bib0052
  article-title: Layout-tree-based approach for identifying visually similar blocks in a web page
  publication-title: 12th International Conference on Computer and Information Science
– reference: W3C, (2015), “World wide web consortium”,
– volume: 60
  start-page: 91
  year: 2004
  end-page: 110
  ident: bib0033
  article-title: Distinctive image features from scale invariant keypoints
  publication-title: IJCV
– volume: 14
  start-page: 1893
  year: 2008
  end-page: 1910
  ident: bib0025
  article-title: Recognizing informative web page blocks using visual segmentation for efficient information extraction
  publication-title: J. Univ. Comput. Sci.
– reference: Cai, D., Yu, S., Wen, J.-R., & Ma, W.-Y. (2003). VIPS: a vision based pages segmentation algorithm, Technical Report MSR-TR-2003-79, Microsoft Research.
– volume: 44
  start-page: 386
  year: 2008
  end-page: 399
  ident: bib0042
  article-title: Aesthetics and credibility in web site design
  publication-title: Inf. Process. Manage.
– volume: 32
  start-page: 1582
  year: 2010
  end-page: 1596
  ident: bib0050
  article-title: Evaluating color descriptors for object and scene recognition
  publication-title: IEEE Trans. Pattern Anal. Mach. Intell.
– reference: Bay, H., Tuytelaars, T., Van Gool, L. (2006). SURF: speeded up robust features, ICCV’06.
– start-page: 2169
  year: 2006
  end-page: 2178
  ident: bib0030
  article-title: Beyond bags of features: spatial pyramid matching recognizing natural scene categories
  publication-title: Proc. IEEE Computer Society Conference on Computer Vision and Pattern Recognition
– reference: (11.8. 2016)
– reference: O'Hara, S., & Draper, B.-A. (2011). Introduction to the bag of features paradigm for image classification,
– volume: 31
  start-page: 739
  year: 2012
  end-page: 751
  ident: bib0036
  article-title: What should a corporate website look like? The influence of gestalt principles and visualisation in website design on the degree of acceptance and recommendation
  publication-title: Behav. Inf. Technol.
– start-page: 457
  year: 2014
  end-page: 470
  ident: bib0006
  article-title: Similay: A developing web page layout based visual similarity search engine
  publication-title: Proc. 10th International Conference of Machine Learning and Data Mining
– start-page: 13
  year: 2012
  end-page: 21
  ident: bib0003
  article-title: Measuring web page similarity based on textual and visual properties
  publication-title: . LNCS
– reference: , (16.9.2015).
– volume: 3
  start-page: 301
  year: 2006
  end-page: 311
  ident: bib0019
  article-title: Detecting phishing pages with visual similarity assessment based on earth mover's distance
  publication-title: IEEE Trans. Dependable Secure Computi.
– volume: 67
  start-page: 135
  year: 2010
  end-page: 146
  ident: bib0028
  article-title: Visual similarities of web pages
  publication-title: , LNCS
– volume: 28
  start-page: 483
  year: 2015
  end-page: 497
  ident: bib0031
  article-title: Moving object classification using local shape and hog features in wavelet-transformed space with hierarchical svm classifiers
  publication-title: Applied Soft Comput.
– volume: 87
  start-page: 1
  year: 2003
  end-page: 32
  ident: bib0022
  article-title: Similarity as transformation
  publication-title: Cognition
– volume: 46
  start-page: 604
  year: 1999
  end-page: 632
  ident: bib0026
  article-title: Authorative sources in hyperlinked environment
  publication-title: J. ACM
– volume: 70
  start-page: 155
  year: 2016
  end-page: 175
  ident: bib0040
  article-title: A layout inference algorithm for graphical user interfaces
  publication-title: Inf. Software Technol.
– start-page: 56
  year: 2009
  end-page: 63
  ident: bib0013
  article-title: Fighting phishing with discriminative key point features
  publication-title: Conference IEEE Internet Computing
– reference: .
– start-page: 16
  year: 2009
  end-page: 18
  ident: bib0024
  article-title: Web document text and image extraction using dom analysis and natural language processing
  publication-title: Proc. 9th ACM Symposium on Document Engineering
– start-page: 454
  year: 2007
  end-page: 463
  ident: bib0043
  article-title: A layout-similarity-based approach for detecting phishing pages
– volume: 10
  year: 2010
  ident: bib0012
  article-title: Detecting visually similar web pages: application to phishing detection
  publication-title: ACM Trans. Internet Technol.
– year: 1997
  ident: bib0017
  article-title: The Logic of Failure
– volume: 60
  start-page: 222
  year: 2007
  end-page: 234
  ident: bib0018
  article-title: Exploiting structural similarity for effective web information extraction
  publication-title: Data Knowl. Eng.
– start-page: 401
  year: 2007
  end-page: 408
  ident: bib0005
  article-title: Representing shape with a spatial pyramid Kernel
  publication-title: Conference of CIVR’07
– reference: OpenCV, (2015). “OpenCV Project”,
– reference: ASP.NET, 2015, “ASP.NET platform”,
– start-page: 203
  year: 2004
  end-page: 211
  ident: bib0046
  article-title: Learning block importance models for web pages
  publication-title: Proc. WWW’04 Conference
– ident: 10.1016/j.ijhcs.2017.10.008_bib0020
– start-page: 2049
  year: 2013
  ident: 10.1016/j.ijhcs.2017.10.008_bib0041
  article-title: Predicting users’ first impressions of website aesthetics with a quantification of perceived visual complexity and colorfulness
  publication-title: Proc. SIGCHI Conf. Human Factors Comput. Syst.
  doi: 10.1145/2470654.2481281
– year: 2008
  ident: 10.1016/j.ijhcs.2017.10.008_bib0011
  article-title: FCTH: fuzzy color and texture histogram - a low level feature for accurate image retrieval
– volume: 14
  start-page: 1893
  issue: 11
  year: 2008
  ident: 10.1016/j.ijhcs.2017.10.008_bib0025
  article-title: Recognizing informative web page blocks using visual segmentation for efficient information extraction
  publication-title: J. Univ. Comput. Sci.
– start-page: 2169
  year: 2006
  ident: 10.1016/j.ijhcs.2017.10.008_bib0030
  article-title: Beyond bags of features: spatial pyramid matching recognizing natural scene categories
– volume: 44
  start-page: 386
  year: 2008
  ident: 10.1016/j.ijhcs.2017.10.008_bib0042
  article-title: Aesthetics and credibility in web site design
  publication-title: Inf. Process. Manage.
  doi: 10.1016/j.ipm.2007.02.003
– ident: 10.1016/j.ijhcs.2017.10.008_bib0037
– start-page: 56
  year: 2009
  ident: 10.1016/j.ijhcs.2017.10.008_bib0013
  article-title: Fighting phishing with discriminative key point features
– volume: 31
  start-page: 739
  issue: 7
  year: 2012
  ident: 10.1016/j.ijhcs.2017.10.008_bib0036
  article-title: What should a corporate website look like? The influence of gestalt principles and visualisation in website design on the degree of acceptance and recommendation
  publication-title: Behav. Inf. Technol.
  doi: 10.1080/0144929X.2011.642893
– start-page: 487
  year: 2005
  ident: 10.1016/j.ijhcs.2017.10.008_bib0048
  article-title: Factors affecting web page similarity
– volume: 70
  start-page: 155
  year: 2016
  ident: 10.1016/j.ijhcs.2017.10.008_bib0040
  article-title: A layout inference algorithm for graphical user interfaces
  publication-title: Inf. Software Technol.
  doi: 10.1016/j.infsof.2015.10.005
– volume: 37
  start-page: 231
  issue: 3
  year: 2013
  ident: 10.1016/j.ijhcs.2017.10.008_bib0053
  article-title: Web phishing detection based on page spatial layout similarity
  publication-title: Informatica
– start-page: 457
  year: 2014
  ident: 10.1016/j.ijhcs.2017.10.008_bib0006
  article-title: Similay: A developing web page layout based visual similarity search engine
– year: 2008
  ident: 10.1016/j.ijhcs.2017.10.008_bib0035
  article-title: Visual complexity and aesthetic perception of web pages
  publication-title: SIGDOC’08
– volume: 60
  start-page: 222
  issue: 1
  year: 2007
  ident: 10.1016/j.ijhcs.2017.10.008_bib0018
  article-title: Exploiting structural similarity for effective web information extraction
  publication-title: Data Knowl. Eng.
  doi: 10.1016/j.datak.2006.01.001
– start-page: 1
  year: 2012
  ident: 10.1016/j.ijhcs.2017.10.008_bib0029
  article-title: Structural and visual similarity learning for web page archiving
– volume: 87
  start-page: 1
  year: 2003
  ident: 10.1016/j.ijhcs.2017.10.008_bib0022
  article-title: Similarity as transformation
  publication-title: Cognition
  doi: 10.1016/S0010-0277(02)00184-1
– year: 1997
  ident: 10.1016/j.ijhcs.2017.10.008_bib0017
– start-page: 13
  year: 2012
  ident: 10.1016/j.ijhcs.2017.10.008_bib0003
  article-title: Measuring web page similarity based on textual and visual properties
– volume: 28
  start-page: 483
  year: 2015
  ident: 10.1016/j.ijhcs.2017.10.008_bib0031
  article-title: Moving object classification using local shape and hog features in wavelet-transformed space with hierarchical svm classifiers
  publication-title: Applied Soft Comput.
  doi: 10.1016/j.asoc.2014.09.051
– year: 2004
  ident: 10.1016/j.ijhcs.2017.10.008_bib0008
  article-title: Block-based web search
– start-page: 1470
  year: 2003
  ident: 10.1016/j.ijhcs.2017.10.008_bib0044
  article-title: Video google: a text retrieval approach to object matching in videos
– start-page: 312
  year: 2008
  ident: 10.1016/j.ijhcs.2017.10.008_bib0010
  article-title: CEDD: color and edge directivity descriptor: a compact descriptor for image indexing and retrieval
– volume: 25
  start-page: 93
  issue: 2
  year: 2010
  ident: 10.1016/j.ijhcs.2017.10.008_bib0009
  article-title: Segmentation method for web page analysis using shrinking and dividing
  publication-title: Int. J. Parallel, Emergent Distrib. Syst.
  doi: 10.1080/17445760802429585
– ident: 10.1016/j.ijhcs.2017.10.008_bib0051
– start-page: 2005
  year: 2005
  ident: 10.1016/j.ijhcs.2017.10.008_bib0014
  article-title: Histogram of oriented gradients for human detection
  publication-title: IEEE Comput. Soc. Conf. Comput. Vision Pattern Recognit.
– volume: 7
  start-page: 155
  year: 1983
  ident: 10.1016/j.ijhcs.2017.10.008_bib0021
  article-title: Structure-mapping: a theoritcal framework for analogy
  publication-title: Cognitive Sci.
  doi: 10.1207/s15516709cog0702_3
– ident: 10.1016/j.ijhcs.2017.10.008_bib0016
– volume: 3
  start-page: 301
  issue: 4
  year: 2006
  ident: 10.1016/j.ijhcs.2017.10.008_bib0019
  article-title: Detecting phishing pages with visual similarity assessment based on earth mover's distance
  publication-title: IEEE Trans. Dependable Secure Computi.
  doi: 10.1109/TDSC.2006.50
– volume: 32
  start-page: 1582
  issue: 9
  year: 2010
  ident: 10.1016/j.ijhcs.2017.10.008_bib0050
  article-title: Evaluating color descriptors for object and scene recognition
  publication-title: IEEE Trans. Pattern Anal. Mach. Intell.
  doi: 10.1109/TPAMI.2009.154
– volume: 10
  issue: 2
  year: 2010
  ident: 10.1016/j.ijhcs.2017.10.008_bib0012
  article-title: Detecting visually similar web pages: application to phishing detection
  publication-title: ACM Trans. Internet Technol.
  doi: 10.1145/1754393.1754394
– start-page: 16
  year: 2009
  ident: 10.1016/j.ijhcs.2017.10.008_bib0024
  article-title: Web document text and image extraction using dom analysis and natural language processing
– volume: 52
  start-page: 27
  issue: 1
  year: 2005
  ident: 10.1016/j.ijhcs.2017.10.008_bib0049
  article-title: Testing the visual consistency of web sites
  publication-title: Tech. Commun.
– start-page: 454
  year: 2007
  ident: 10.1016/j.ijhcs.2017.10.008_bib0043
  article-title: A layout-similarity-based approach for detecting phishing pages
– ident: 10.1016/j.ijhcs.2017.10.008_bib0002
– ident: 10.1016/j.ijhcs.2017.10.008_bib0004
  doi: 10.1007/11744023_32
– volume: 67
  start-page: 135
  year: 2010
  ident: 10.1016/j.ijhcs.2017.10.008_bib0028
  article-title: Visual similarities of web pages
  publication-title: Adv. Intell. Web Master., LNCS
  doi: 10.1007/978-3-642-10687-3_13
– start-page: 1
  year: 2010
  ident: 10.1016/j.ijhcs.2017.10.008_bib0039
  article-title: Vi-DIFF: understanding web pages changes
– start-page: 65
  year: 2013
  ident: 10.1016/j.ijhcs.2017.10.008_bib0052
  article-title: Layout-tree-based approach for identifying visually similar blocks in a web page
– ident: 10.1016/j.ijhcs.2017.10.008_bib0038
– start-page: 203
  year: 2004
  ident: 10.1016/j.ijhcs.2017.10.008_bib0046
  article-title: Learning block importance models for web pages
– start-page: 401
  year: 2007
  ident: 10.1016/j.ijhcs.2017.10.008_bib0005
  article-title: Representing shape with a spatial pyramid Kernel
– year: 2005
  ident: 10.1016/j.ijhcs.2017.10.008_bib0047
  article-title: Visual similarity comparison for web page retrieval,
– volume: 107
  start-page: 589
  issue: 6
  year: 2013
  ident: 10.1016/j.ijhcs.2017.10.008_bib0032
  article-title: A computational theory of visual receptive fields
  publication-title: Biolog. Cybern.
  doi: 10.1007/s00422-013-0569-z
– volume: 60
  start-page: 91
  issue: 2
  year: 2004
  ident: 10.1016/j.ijhcs.2017.10.008_bib0033
  article-title: Distinctive image features from scale-invariant keypoints
  publication-title: IJCV
  doi: 10.1023/B:VISI.0000029664.99615.94
– volume: 46
  start-page: 604
  issue: 5
  year: 1999
  ident: 10.1016/j.ijhcs.2017.10.008_bib0026
  article-title: Authoritative sources in a hyperlinked environment
  publication-title: J. ACM
  doi: 10.1145/324133.324140
– ident: 10.1016/j.ijhcs.2017.10.008_bib0007
– volume: 22
  start-page: 115
  issue: 3
  year: 2005
  ident: 10.1016/j.ijhcs.2017.10.008_bib0034
  article-title: That site looks 88.46% familiar: quantifying similarity of web page design
  publication-title: Expert Syst.
  doi: 10.1111/j.1468-0394.2005.00302.x
– volume: 27
  start-page: 793
  issue: 8
  year: 2011
  ident: 10.1016/j.ijhcs.2017.10.008_bib0045
  article-title: The role of structure and content in perception of visual similarity between web pages
  publication-title: Int. J. Human-Comput. Interaction
– start-page: 45
  year: 2010
  ident: 10.1016/j.ijhcs.2017.10.008_bib0001
  article-title: A tool for computing visual similarity of web pages
– volume: 1
  start-page: 3057
  issue: 10
  year: 2006
  ident: 10.1016/j.ijhcs.2017.10.008_bib0015
  article-title: Measuring the structural similarity of web-based documents: a novel approach
  publication-title: Int. J. Comput. Intelligence
– start-page: 30
  year: 2009
  ident: 10.1016/j.ijhcs.2017.10.008_bib0023
  article-title: Visual similarity-based phishing detection without victim site information
StartPage 95
SubjectTerms Bag of features
Histogram of oriented gradients
Layout similarity
Similarity ranking
Spatial pyramid matching
Web page layout
Title Layout-based computation of web page similarity ranks
URI https://dx.doi.org/10.1016/j.ijhcs.2017.10.008
Volume 110