StreamDFP: A General Stream Mining Framework for Adaptive Disk Failure Prediction

Bibliographic Details
Published in IEEE transactions on computers Vol. 72; no. 2; pp. 520-534
Main Authors Han, Shujie, Lee, Patrick P. C., Shen, Zhirong, He, Cheng, Liu, Yi, Huang, Tao
Format Journal Article
Language English
Published New York IEEE 01.02.2023
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN 0018-9340
EISSN 1557-9956
DOI 10.1109/TC.2022.3160365

Abstract We explore machine learning for accurately predicting imminent disk failures, and hence providing proactive fault tolerance, for modern large-scale storage systems. Current disk failure prediction approaches are mostly offline and assume that the disk logs required for training learning models are available a priori. However, disk logs are often continuously generated as an evolving data stream in which the statistical patterns vary over time (also known as concept drift). This challenge motivates the need for online techniques that perform training and prediction on the incoming stream of disk logs in real time while adapting to concept drift. We first measure and demonstrate the existence of concept drift on various disk models in production. Motivated by our study, we design StreamDFP, a general stream mining framework for disk failure prediction with concept-drift adaptation. StreamDFP is based on three key techniques, namely online labeling, concept-drift-aware training, and general prediction, with the primary objective of supporting various machine learning algorithms. We extend StreamDFP to support online transfer learning for minority disk models with concept-drift adaptation. Our evaluation shows that StreamDFP significantly improves prediction accuracy over prediction without concept-drift adaptation under various settings, and achieves reasonably high stream processing performance.
Author_xml – sequence: 1
  givenname: Shujie
  orcidid: 0000-0001-5311-5782
  surname: Han
  fullname: Han, Shujie
  email: shujiehan@pku.edu.cn
  organization: Peking University, Beijing, China
– sequence: 2
  givenname: Patrick P. C.
  orcidid: 0000-0002-4501-4364
  surname: Lee
  fullname: Lee, Patrick P. C.
  email: pclee@cse.cuhk.edu.hk
  organization: Chinese University of Hong Kong, Hong Kong
– sequence: 3
  givenname: Zhirong
  surname: Shen
  fullname: Shen, Zhirong
  email: shenzr@xmu.edu.cn
  organization: Xiamen University, Xiamen, China
– sequence: 4
  givenname: Cheng
  surname: He
  fullname: He, Cheng
  email: hecheng.hc@alibaba-inc.com
  organization: Alibaba Group, Hangzhou, China
– sequence: 5
  givenname: Yi
  surname: Liu
  fullname: Liu, Yi
  email: mars.ly@alibaba-inc.com
  organization: Alibaba Group, Hangzhou, China
– sequence: 6
  givenname: Tao
  surname: Huang
  fullname: Huang, Tao
  email: zuiwu.ht@alibaba-inc.com
  organization: Alibaba Group, Hangzhou, China
CODEN ITCOB4
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2023
Discipline Engineering
Computer Science
EISSN 1557-9956
EndPage 534
ExternalDocumentID 10_1109_TC_2022_3160365
9737380
Genre orig-research
GrantInformation_xml – fundername: National Key R&D Program of China
  grantid: 2021YFF0704001
– fundername: Alibaba Group
– fundername: National Natural Science Foundation of China; Natural Science Foundation of China
  grantid: 62072381
  funderid: 10.13039/501100001809
IsPeerReviewed true
IsScholarly true
Issue 2
Language English
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
https://doi.org/10.15223/policy-029
https://doi.org/10.15223/policy-037
PublicationDate 2023-02-01
PublicationDateYYYYMMDD 2023-02-01
PublicationDate_xml – month: 02
  year: 2023
  text: 2023-02-01
  day: 01
PublicationDecade 2020
PublicationPlace New York
PublicationPlace_xml – name: New York
PublicationTitle IEEE transactions on computers
PublicationTitleAbbrev TC
PublicationYear 2023
Publisher IEEE
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Publisher_xml – name: IEEE
– name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
StartPage 520
SubjectTerms Adaptation
Adaptation models
Algorithms
online transfer learning
concept drift
Data transmission
Disk failure prediction
Drift
Failure
Fault tolerance
Machine learning
Machine learning algorithms
Prediction algorithms
Predictions
Predictive models
Production
Random forests
Storage systems
stream mining
Training
Title StreamDFP: A General Stream Mining Framework for Adaptive Disk Failure Prediction
URI https://ieeexplore.ieee.org/document/9737380
https://www.proquest.com/docview/2766621725
Volume 72