GRU-DF: A Temporal Model with Dynamic Imputation for Missing Target Values in Longitudinal Patient Data

Bibliographic Details
Published in Proceedings (IEEE International Conference on Healthcare Informatics. Online), pp. 1-7
Main Authors Zhao, Yijun; Berretta, Matias; Wang, Tong; Chitnis, Tanuja
Format Conference Proceeding
Language English
Published IEEE 01.11.2020
ISSN 2575-2634
DOI 10.1109/ICHI48887.2020.9374359

Abstract Temporal models are desirable in studying progressive diseases because the data are typically collected at regular time intervals. However, such clinical data often contain many missing entries, including those from the target variable that we are interested in predicting. Standard imputation techniques (e.g., linear interpolation) are inappropriate in treating missing target observations because they approximate the missing entries before the onset of model training and, thus, would inevitably lead to training a self-fulfilling model. The absence of target observations is particularly problematic for time series data where their availability at each time step is indispensable in building a temporal model. We propose a novel approach that incorporates the missing target value imputation into the training process of the Gated Recurrent Unit (GRU) model. We evaluate our new model in our motivating domain of predicting disease progression of multiple sclerosis patients using a real-world dataset of 508 subjects. The goal is to forecast patients' disability levels based on data collected in six-month intervals. Our model demonstrates a 27.9% performance gain over a GRU model with a standard forward-fill treatment for the missing target observations. Additionally, our model displays a 21.6% advantage over a non-temporal approach for our machine learning task.
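The two standard treatments named in the abstract (forward fill and linear interpolation) can be illustrated with a short sketch. This is not the GRU-DF method, which the record does not describe in detail. The function names and the sample scores are hypothetical and chosen only for illustration. The point is that both treatments fix the missing target values before any model is trained, so a model fitted afterwards is partly trained to reproduce the imputation rule.

```python
from typing import List, Optional

Series = List[Optional[float]]


def forward_fill(values: Series) -> Series:
    """Replace each missing entry with the most recent observed value."""
    filled: Series = []
    last: Optional[float] = None
    for v in values:
        if v is not None:
            last = v
        # Entries before the first observation stay None.
        filled.append(last)
    return filled


def linear_interpolate(values: Series) -> Series:
    """Fill gaps between two observed entries along a straight line."""
    filled: Series = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = values[left] + step * (i - left)
    # Gaps before the first or after the last observation are left as None.
    return filled


# Hypothetical target series: one score per six-month visit, None = missing.
scores: Series = [2.0, None, None, 3.5, None, 4.0]

print(forward_fill(scores))        # [2.0, 2.0, 2.0, 3.5, 3.5, 4.0]
print(linear_interpolate(scores))  # [2.0, 2.5, 3.0, 3.5, 3.75, 4.0]
```

In both outputs the filled entries are deterministic functions of the observed ones, which is the property the abstract describes as leading to a self-fulfilling model when such values are used as training targets.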
Author Affiliations
Zhao, Yijun: Fordham University, Computer and Information Science Department, New York, NY, USA
Berretta, Matias: Fordham University, Computer and Information Science Department, New York, NY, USA
Wang, Tong: Fordham University, Computer and Information Science Department, New York, NY, USA
Chitnis, Tanuja: Brigham and Women's Hospital, Harvard Medical School, Department of Neurology, Boston, MA, USA
Discipline Medicine
EISBN 9781728153827, 1728153824
EISSN 2575-2634
Genre orig-research
Subject Terms Data models; disease progression; gated recurrent unit (GRU); longitudinal data; missing value imputation; Multiple sclerosis; Performance gain; Predictive models; recurrent neural network (RNN); temporal model; time series; Time series analysis; Training
URI https://ieeexplore.ieee.org/document/9374359