Utility of Multiclass Machine Learning Algorithms in Predicting Same-Day Discharge Following Primary Total Knee Arthroplasty

Bibliographic Details
Published in The Journal of Arthroplasty
Main Authors Chen, Shane F., Buddhiraju, Anirudh, Chen, Tony L.-W., Ilyas, Muhammad H., Shimizu, Michelle, Kwon, Young-Min
Format Journal Article
Language English
Published United States Elsevier Inc 03.09.2025
ISSN 0883-5403
1532-8406
DOI 10.1016/j.arth.2025.08.072

Summary: Length of stay (LOS) is a substantial driver of costs following primary total knee arthroplasty (TKA), leading to increased efforts targeting same-day discharge (SDD). However, patient selection for SDD TKA remains a challenge, with 7% to 49% of patients failing to achieve planned SDD under current stratification tools. This study aimed to develop and assess multiclass machine learning (ML) models for selecting patients for SDD TKA and for estimating the risk of prolonged LOS, using a large national patient cohort. The database was queried to identify 167,859 primary TKAs performed between 2017 and 2023. The LOS was categorized into SDD (LOS = zero days), discharge within one to three days, and prolonged LOS (> three days). Machine learning models, including artificial neural networks, random forests (RF), k-nearest neighbors, and XGBoost, were developed and evaluated using the confusion matrix, Cohen's kappa, and the area under the receiver operating characteristic curve. The rates of same-day discharge, discharge within one to three days, and prolonged LOS were 2.1%, 88.1%, and 9.8%, respectively. The RF demonstrated the best performance in predicting the different LOS groups, with an average precision of 90.2% and a recall of 90.3%. For multiclass classification, RF had an accuracy of 90.3%, a Cohen's kappa of 0.85, and a micro-averaged area under the receiver operating characteristic curve of 0.97. Prominent predictors of LOS included anesthesia type, sex, body mass index, American Society of Anesthesiologists score, hypertension, age, and operation time. Our findings demonstrate the ability of ML models to accurately identify SDD-eligible patients and patients at risk of prolonged LOS after primary TKA. Our models may assist surgeons with patient selection for outpatient surgery, thereby improving outcomes, resource allocation, and cost efficiency of TKA.
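
The three LOS classes and two of the evaluation metrics named in the summary (the confusion matrix and Cohen's kappa) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the class labels, and the toy data below are assumptions made for illustration, and only the class boundaries (zero days, one to three days, more than three days) come from the summary. Because 88.1% of patients fall in the middle class, a model that always predicts "1-3 days" would already reach 88.1% accuracy, which is why a chance-corrected measure such as Cohen's kappa is informative here.

```python
# Illustrative sketch only; names, labels, and data are hypothetical.
LABELS = ("SDD", "1-3 days", "prolonged")


def categorize_los(days):
    """Map length of stay in days to the three classes from the summary."""
    if days < 0:
        raise ValueError("length of stay cannot be negative")
    if days == 0:
        return "SDD"
    if days <= 3:
        return "1-3 days"
    return "prolonged"


def confusion_matrix(y_true, y_pred, labels=LABELS):
    """Rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for truth, pred in zip(y_true, y_pred):
        matrix[index[truth]][index[pred]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of cases on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def cohens_kappa(matrix):
    """Agreement between truth and prediction, corrected for chance."""
    total = sum(sum(row) for row in matrix)
    observed = accuracy(matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Agreement expected if predictions were independent of the truth.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total**2
    if expected == 1.0:
        raise ValueError("kappa is undefined when expected agreement is 1")
    return (observed - expected) / (1 - expected)


# Toy data (made up, not from the study).
los_days = [0, 1, 2, 3, 4, 7, 0, 2]
y_true = [categorize_los(d) for d in los_days]
y_pred = ["SDD", "1-3 days", "1-3 days", "prolonged",
          "prolonged", "prolonged", "1-3 days", "1-3 days"]

matrix = confusion_matrix(y_true, y_pred)

# Baseline that always predicts the majority class.
baseline = confusion_matrix(y_true, ["1-3 days"] * len(y_true))
```

On this toy data the hypothetical model reaches an accuracy of 0.75 and a kappa of 0.6, while the majority-class baseline reaches an accuracy of 0.5 but a kappa of 0, since it agrees with the truth no more often than chance would.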