Enabling Fraud Prediction on Preliminary Data Through Information Density Booster
Published in | IEEE Transactions on Information Forensics and Security, Vol. 18; p. 1 |
---|---|
Main Authors | Zhu, Hangyu; Wang, Cheng |
Format | Journal Article |
Language | English |
Published | New York: IEEE, 01.01.2023. The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
ISSN | 1556-6013 1556-6021 |
DOI | 10.1109/TIFS.2023.3300523 |
Abstract | In online lending services, fraud prediction is a critical step for controlling loss risk and improving processing efficiency. It is challenging because the ex-ante prediction must be made from only the most basic applicant information. This work identifies the essential difficulty as the low information density of the data associations that contain the information useful for fraud prediction. Accordingly, we propose a novel multi-stage data representation scheme, called AI2Vec (Applicant Information Vectoring), as an information density booster. It gradually boosts the information density of associations by simultaneously decreasing the scale of information carriers and increasing the amount of useful information. The performance of AI2Vec is validated by experiments on real-life data from a prestigious online lending platform. It helps commonly used machine learning classifiers outperform state-of-the-art methods, including the pilot platform's method with manual feature engineering by subject matter experts. |
---|---|
Author | Zhu, Hangyu (ORCID: 0000-0001-7221-9130), Department of Computer Science and Engineering, Tongji University, China; Wang, Cheng (ORCID: 0000-0002-4752-0316), Department of Computer Science and Engineering, Tongji University, China |
CODEN | ITIFA6 |
Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2023 |
Discipline | Engineering; Computer Science |
Grant Information | National Natural Science Foundation of China, grant 61972287 (funder ID: 10.13039/501100001809) |
IsPeerReviewed | true |
IsScholarly | true |
License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html https://doi.org/10.15223/policy-029 https://doi.org/10.15223/policy-037 |
Subject Terms | Boosting; Data mining; Density; Fraud; Fraud prediction; History; information density; Machine learning; network representation learning; online lending services; Representation learning; Risk management; Semantics; Task analysis |
URI | https://ieeexplore.ieee.org/document/10198384 https://www.proquest.com/docview/2867392311 |