An integrated framework for diagnosing process faults with incomplete features

Bibliographic Details
Published in	Knowledge and Information Systems, Vol. 64, no. 1, pp. 75-93
Main Authors	Razavi-Far, Roozbeh; Saif, Mehrdad; Palade, Vasile; Chakrabarti, Shiladitya
Format	Journal Article
Language	English
Published	London: Springer London, 01.01.2022; Springer Nature B.V.
Subjects	Algorithms; Classification; Computer Science; Data analysis; Data Mining and Knowledge Discovery; Database Management; Decision making; Diagnostic software; Diagnostic systems; Dimensionality reduction; Discriminant analysis; Fault diagnosis; Heteroscedastic discriminant analysis; Information Storage and Retrieval; Information Systems and Communication Service; Information Systems Applications (incl. Internet); IT in Business; Missing data; Missing data imputation; Modules; Principal component analysis; Reduction; Regular Paper
ISSN	0219-1377; 0219-3116
DOI	10.1007/s10115-021-01625-w

Summary:	Handling missing values and high-dimensional feature sets is a crucial requirement for data-driven fault diagnosis systems. However, most intelligent data-driven diagnostic systems cannot handle missing data, and high-dimensional feature sets further complicate the process of fault diagnosis. This paper devises a missing data imputation unit and a dimensionality reduction unit for the pre-processing module of the diagnostic system, and proposes a novel pooling strategy for missing data imputation (PSMI). This strategy can simplify complex patterns of missingness and incrementally update the pool. The pre-processing module receives incomplete observations, PSMI estimates the missing values, and the dimensionality reduction unit then transforms the completed observations onto a lower-dimensional feature space. These transformed observations are fed as inputs to the fault classification module for decision making and diagnosis. The diagnostic scheme makes use of various state-of-the-art missing data imputation, dimensionality reduction and classification algorithms, which enables a comprehensive comparison and makes it possible to identify the best techniques for diagnosing faults in the Tennessee Eastman process. The obtained results show the effectiveness of the proposed pooling strategy and indicate that principal component analysis imputation and heteroscedastic discriminant analysis outperform the other imputation and dimensionality reduction techniques in this diagnostic application.
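
The three-stage flow described in the summary (imputation, then dimensionality reduction, then classification) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: column-mean imputation stands in for PSMI, a single principal axis found by power iteration stands in for the dimensionality reduction unit, a nearest-centroid rule stands in for the fault classifier, and the toy data and all function names are hypothetical. It is not the authors' method and does not use Tennessee Eastman data.

```python
import math

def column_means(rows):
    """Mean of the observed (non-None) values in each column."""
    means = []
    for j in range(len(rows[0])):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed))
    return means

def fill(row, means):
    """Stand-in imputation: replace each missing value with its column mean."""
    return [means[j] if x is None else x for j, x in enumerate(row)]

def leading_axis(rows, iters=200):
    """Column means and unit-length leading principal axis (power iteration)."""
    n, d = len(rows), len(rows[0])
    mu = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mu[j] for j in range(d)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    v = [1.0] * d
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return mu, v

def project(row, mu, v):
    """Score of one completed observation along the principal axis."""
    return sum((row[j] - mu[j]) * v[j] for j in range(len(row)))

def fit_centroids(scores, labels):
    """Mean projected score per class label."""
    sums = {}
    for s, y in zip(scores, labels):
        total, count = sums.get(y, (0.0, 0))
        sums[y] = (total + s, count + 1)
    return {y: total / count for y, (total, count) in sums.items()}

def classify(score, centroids):
    """Assign the label whose centroid is closest to the score."""
    return min(centroids, key=lambda y: abs(score - centroids[y]))

# Hypothetical incomplete training observations (None marks a missing value).
train = [
    [1.0, 2.0, None], [1.2, None, 0.9], [0.9, 2.1, 1.1],
    [5.0, 6.1, None], [None, 5.9, 5.2], [5.1, 6.0, 4.9],
]
labels = ["normal", "normal", "normal", "fault", "fault", "fault"]

means = column_means(train)                      # imputation unit (stand-in)
completed = [fill(r, means) for r in train]
mu, axis = leading_axis(completed)               # dimensionality reduction unit
scores = [project(r, mu, axis) for r in completed]
centroids = fit_centroids(scores, labels)        # fault classification module

new_obs = [4.8, None, 5.0]                       # incomplete observation to diagnose
predicted = classify(project(fill(new_obs, means), mu, axis), centroids)
print(predicted)
```

The new observation is completed with the training column means and projected with the training axis, so nothing is re-estimated at diagnosis time. The sketch does not attempt the pooling or incremental updating that the summary attributes to PSMI.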