Hybrid Approach for Phishing Website Detection Using Classification Algorithms

The internet has significantly altered how we work and interact with one another. Statistics show that 63.1% of the present world population are internet users. This clearly indicates how heavily people depend on digital media. Digital media users are on the rise and so is the incidence of cyber crim...

Bibliographic Details
Published in ParadigmPlus Vol. 3; no. 3; pp. 16-29
Main Authors Raj, Mukta Mithra; Arul Jothi, J. Angel
Format Journal Article
Language English
Published ITI Research Group 20.12.2022
ISSN 2711-4627
DOI 10.55969/paradigmplus.v3n3a2

Abstract The internet has significantly altered how we work and interact with one another. Statistics show that 63.1% of the present world population are internet users, which indicates how heavily people depend on digital media. Digital media users are on the rise, and so is the incidence of cybercrime. People who lack experience and knowledge are more vulnerable to phishing scams. The victims experience severe consequences because their personal credentials are at stake. Phishers use publicly available sources to acquire details about the victim's professional and personal history. Countermeasures must be implemented with the highest priority. Detection of malicious websites can significantly reduce the risk of phishing attempts. In this research, a highly accurate website phishing detection method based on URL features is proposed. We investigated eight existing machine learning classification techniques to detect malicious websites: extreme gradient boosting (XGBoost), random forest (RF), adaptive boosting (AdaBoost), decision trees (DT), K-nearest neighbors (KNN), support vector machines (SVM), logistic regression, and naïve Bayes (NB). The results show that XGBoost had the best accuracy, with a score of 96.71%, followed by random forest and AdaBoost. We further experimented with various hybrid combinations of the top three classifiers and observed that the XGBoost-Random Forest hybrid produced the best results. The hybrid model classified the websites as legitimate or phishing with an accuracy of 97.07%.
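The abstract does not list the URL features used or say how the two classifiers in the hybrid are combined, so the following is only a minimal sketch under stated assumptions. It assumes a small, hypothetical set of lexical URL features and assumes the hybrid is a soft vote, that is, an average of the phishing probabilities from two models. The function names `url_features` and `soft_vote` are illustrative and do not come from the paper.

```python
import re
from urllib.parse import urlparse


def url_features(url: str) -> dict:
    """Extract a few illustrative lexical features from a URL.

    The feature set is an assumption for illustration only; the
    abstract does not say which URL features the authors used.
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "num_dots_in_host": host.count("."),
        "num_hyphens_in_host": host.count("-"),
        "has_at_symbol": int("@" in url),
        # 1 if the host is a dotted-quad IPv4 address rather than a domain name
        "has_ip_host": int(re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host) is not None),
        "uses_https": int(parsed.scheme == "https"),
    }


def soft_vote(prob_a: float, prob_b: float, threshold: float = 0.5) -> str:
    """Combine two classifiers by averaging their phishing probabilities.

    Soft voting is an assumed combination rule; in practice prob_a and
    prob_b would come from trained models (for example XGBoost and a
    random forest), whereas here they are plain numbers.
    """
    mean_prob = (prob_a + prob_b) / 2.0
    return "phishing" if mean_prob >= threshold else "legitimate"


if __name__ == "__main__":
    print(url_features("http://192.168.0.1/login@secure"))
    print(soft_vote(0.9, 0.4))
```

In a full pipeline, the feature dictionaries would be turned into a numeric matrix, each classifier would be trained on labeled URLs, and the averaged probability would be thresholded as above.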
EISSN 2711-4627
EndPage 29
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed false
IsScholarly false
Issue 3
Language English
License https://creativecommons.org/licenses/by/4.0
cc-by
OpenAccessLink https://proxy.k.utb.cz/login?url=https://journals.itiud.org/index.php/paradigmplus/article/download/39/45
PageCount 14
PublicationCentury 2000
PublicationDate 2022-12-20
PublicationDateYYYYMMDD 2022-12-20
PublicationDate_xml – month: 12
  year: 2022
  text: 2022-12-20
  day: 20
PublicationDecade 2020
PublicationTitle ParadigmPlus
PublicationYear 2022
Publisher ITI Research Group
Publisher_xml – name: ITI Research Group
StartPage 16
SubjectTerms Data Mining
Hybrid Classification Algorithms
Machine Learning
Phishing Website Detection
URL Features
Title Hybrid Approach for Phishing Website Detection Using Classification Algorithms
URI https://journals.itiud.org/index.php/paradigmplus/article/download/39/45
https://doaj.org/article/8e225217f3404f898616367aee6a0017
UnpaywallVersion publishedVersion
Volume 3