Layout-based computation of web page similarity ranks
| Published in | International journal of human-computer studies Vol. 110; pp. 95 - 114 |
|---|---|
| Main Authors | Bozkir, Ahmet Selman; Akcapinar Sezer, Ebru |
| Format | Journal Article |
| Language | English |
| Published | Elsevier Ltd, 01.02.2018 |
| Subjects | Bag of features; Histogram of oriented gradients; Layout similarity; Similarity ranking; Spatial pyramid matching; Web page layout |
| ISSN | 1071-5819 1095-9300 |
| DOI | 10.1016/j.ijhcs.2017.10.008 |
| Abstract | Highlights:
• A novel approach for measuring and ranking web page layout similarity is proposed.
• The visual layout of a web page can be used as a query item.
• Wireframes are proposed and verified as a representation method.
• Form elements, animations and whitespace are introduced as structural features.
• A new, verified dataset, SWPS-40, is introduced for further studies on web page layout similarity.
• Experiments show that using a wireframe representation together with structural and vision-based features for web page visual similarity comparison is highly reasonable.
In this paper, we propose a ranking approach that considers visual similarities among web pages by using structure- and vision-based features. Throughout the study, we aim to understand and represent the visual structure of a web page the way people do, focusing on layout similarity through the wireframe design. The study has two parts. In the first part, structural similarities are analyzed with the proposed concept of “layout components”, along with visual inspection of DOM trees. In this way, five types of structural layout components are proposed. Whitespace is also used, since it is an important cue in the visual perception of web pages. In the second part, a computer-vision method, the histogram of oriented gradients (HOG), is employed to reveal local visual cues in terms of edge orientations. After feature extraction, the extracted feature histograms are mapped onto spatial pyramid matching, a multilevel, multi-resolution bag-of-features representation that preserves spatial information. In this way, three goals were achieved: (1) the visual layout of web pages was mapped and compared in a multi-resolution scheme; (2) the intermediate step of visual segmentation was removed; and (3) efficient and easily comparable web page layout signatures were generated. We also conducted a questionnaire study with 312 subjects, which let us create a benchmark dataset of similarity scores collected from individuals. So far, no corpus oriented toward web page layout similarity ranking exists in the literature. The proposed approach achieved remarkable ranking performance in top-5 and top-10 retrieval results. According to the comparative study, our approach outperforms some structure- and vision-based studies in the literature.
With this achievement, web pages can be used as query items to find other, similar web pages while being treated as web pages rather than as images or anything else. |

|---|---|
| Author | Bozkir, Ahmet Selman (selman@cs.hacettepe.edu.tr); Akcapinar Sezer, Ebru |

| ContentType | Journal Article | 
    
| Copyright | 2017 Elsevier Ltd | 
    
| DOI | 10.1016/j.ijhcs.2017.10.008 | 
    
| Discipline | Engineering | 
    
| EISSN | 1095-9300 | 
    
| EndPage | 114 | 
    
| ISSN | 1071-5819 | 
    
| IsPeerReviewed | true | 
    
| IsScholarly | true | 
    
| Keywords | Bag of features; Histogram of oriented gradients; Web page layout; Similarity ranking; Layout similarity; Spatial pyramid matching |
    
| Language | English | 
    
| PageCount | 20 | 
    
| PublicationCentury | 2000 | 
    
| PublicationDate | February 2018 2018-02-00  | 
    
| PublicationDateYYYYMMDD | 2018-02-01 | 
    
| PublicationDecade | 2010 | 
    
| PublicationTitle | International journal of human-computer studies | 
    
| PublicationYear | 2018 | 
    
| Publisher | Elsevier Ltd | 
    
| StartPage | 95 | 
    
| SubjectTerms | Bag of features; Histogram of oriented gradients; Layout similarity; Similarity ranking; Spatial pyramid matching; Web page layout | 
    
| Title | Layout-based computation of web page similarity ranks | 
    
| URI | https://dx.doi.org/10.1016/j.ijhcs.2017.10.008 | 
    
| Volume | 110 | 
    
    