Cost-Sensitive Learning for Imbalanced Bad Debt Datasets in Healthcare Industry


Bibliographic Details
Published in 2015 Asia-Pacific Conference on Computer Aided System Engineering, pp. 30-35
Main Authors Donghui Shi, Jian Guan, Jozef Zurada
Format Conference Proceeding
Language English
Published IEEE, 1 July 2015
Subjects
DOI 10.1109/APCASE.2015.13

Abstract Research on computational intelligence methods for improving bad debt recovery is imperative given the rapid increase in the cost of healthcare in the U.S. This study explores the effectiveness of cost-sensitive learning methods for classifying unknown cases in imbalanced bad debt datasets and compares the results with those of two other methods often used to process imbalanced datasets: undersampling and oversampling. The study also analyzes how a semi-supervised learning algorithm behaves under different circumstances. The results show that although oversampling yields the best predictive accuracy rates on balanced testing datasets, it is impractical because classes are imbalanced in real healthcare situations. Models constructed with undersampling achieve high classification accuracy rates for the minority class in imbalanced datasets, but they tend to worsen the classification accuracy rates of the majority class. Cost-sensitive learning methods, by contrast, can improve the classification accuracy rates of the minority class in imbalanced datasets while achieving reasonably good overall and majority-class accuracy rates. The results and analysis in this study show that cost-sensitive learning methods provide a potentially viable approach to classifying unknown cases in imbalanced bad debt datasets. Finally, more practical predictive results are obtained by using the models to predict the unlabeled cases. Although oversampling and cost-sensitive learning combined with semi-supervised learning can improve the overall and majority-class classification accuracy rates, the minority-class accuracy rates remain relatively low. The semi-supervised learning algorithms need to be improved to suit imbalanced bad debt datasets.
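Illustrative note: the abstract contrasts three ways of handling class imbalance but does not state which classifiers, cost values, or sampling procedures the authors used. The following is a minimal, generic sketch of the three ideas, not the paper's method. All function names are hypothetical, labels are assumed to be 0 (majority) and 1 (minority), and the cost-sensitive rule assumes that correct decisions cost nothing.

```python
import random
from collections import defaultdict


def split_by_class(rows):
    """Group (features, label) rows by their label."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[1]].append(row)
    return groups


def undersample(rows, seed=0):
    """Randomly discard rows from larger classes until every class matches the smallest."""
    rng = random.Random(seed)
    groups = split_by_class(rows)
    target = min(len(g) for g in groups.values())
    return [r for g in groups.values() for r in rng.sample(g, target)]


def oversample(rows, seed=0):
    """Randomly duplicate rows from smaller classes until every class matches the largest."""
    rng = random.Random(seed)
    groups = split_by_class(rows)
    target = max(len(g) for g in groups.values())
    out = []
    for g in groups.values():
        out.extend(g)
        out.extend(rng.choices(g, k=target - len(g)))
    return out


def cost_sensitive_predict(p_minority, cost_fn, cost_fp):
    """Predict the minority class (1) when that choice has the lower expected cost.

    Expected cost of predicting 0 is p * cost_fn (a missed minority case);
    expected cost of predicting 1 is (1 - p) * cost_fp (a false alarm).
    """
    threshold = cost_fp / (cost_fp + cost_fn)
    return 1 if p_minority >= threshold else 0


def per_class_accuracy(y_true, y_pred):
    """Return (overall, majority-class, minority-class) accuracy for 0/1 labels."""
    def acc(pairs):
        pairs = list(pairs)
        return sum(t == p for t, p in pairs) / len(pairs)

    pairs = list(zip(y_true, y_pred))
    return (
        acc(pairs),
        acc(x for x in pairs if x[0] == 0),
        acc(x for x in pairs if x[0] == 1),
    )
```

Resampling changes the training data, whereas the cost-sensitive rule leaves the data untouched and only shifts the decision threshold. Reporting overall, majority-class, and minority-class accuracy separately mirrors the three rates the abstract compares.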
Author Affiliations
Donghui Shi (sdonghui@gmail.com), Dept. of Comput. Eng., Anhui Jianzhu Univ., Hefei, China
Jian Guan (jeff.guan@louisville.edu), Dept. of Comput. Inf. Syst., Univ. of Louisville, Louisville, KY, USA
Jozef Zurada (jozef.zurada@louisville.edu), Dept. of Comput. Inf. Syst., Univ. of Louisville, Louisville, KY, USA
EISBN 1479975885
9781479975884
OpenAccessLink https://proxy.k.utb.cz/login?url=http://dspace.utpl.edu.ec/handle/123456789/18864
SubjectTerms Accuracy
bad debt recovery
Classification algorithms
cost-sensitive
imbalanced
Medical services
semi-supervised learning
Semisupervised learning
Testing
Training
URI https://ieeexplore.ieee.org/document/7286989
http://dspace.utpl.edu.ec/handle/123456789/18864