Data mining algorithm for pre-processing biopharmaceutical drug product manufacturing records
| Published in | Computers & chemical engineering Vol. 124; pp. 253 - 269 |
|---|---|
| Format | Journal Article |
| Language | English |
| Published | Elsevier Ltd, 08.05.2019 |
| ISSN | 0098-1354 1873-4375 |
| DOI | 10.1016/j.compchemeng.2018.12.001 |
| Summary: | • The preprocessing algorithm removes the noise from a continuously measured dataset. • The dataset is visualized as a DNA strand and the process recipe as a gene sequence. • Real-time integration of the algorithm into operations assures data integrity. • The outcome is noise-free, structured data suitable for decision-making. • A new depiction of root causes enables fast, quantitative decision-making.
The quality of data plays a crucial role in reliable decision-making when improving processes and operations under uncertainty. We present a data mining-based algorithm for robustly pre-processing the manufacturing records of biopharmaceutical batch processes. The algorithm identifies the time intervals in which the process is in commercial operation and characterizes process failures automatically. An approximate string-matching algorithm, a decision tree classifier, and constrained clustering are applied to sequence the raw data, classify the noise, and identify each individual batch; finally, process failures are characterized. In a case study, the algorithm was applied to the records of the “cleaning- and sterilizing-in-place” process, which is essential in a manufacturing environment. The algorithm was trained on the outcome of state-of-the-art manual pre-processing, and its application reduced the execution time of the activity to 11.7% while maintaining high data quality and integrity. |
|---|---|
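
The summary describes sequencing raw records with an approximate string-matching algorithm but does not give the authors' implementation. As a rough illustration only, the sketch below uses Sellers' algorithm (a standard edit-distance variant for approximate substring search) to locate a recipe inside a noisy stream of process phases. The one-letter phase codes, the recipe `"RCRSD"`, and the function name `find_approximate_matches` are all hypothetical and are not taken from the paper.

```python
def find_approximate_matches(recipe: str, record: str, max_errors: int) -> list[tuple[int, int]]:
    """Return (end_index, distance) pairs for every position in `record`
    where some substring ending there matches `recipe` within `max_errors`
    edits (insertions, deletions, substitutions).

    Hypothetical helper: a generic Sellers-style search, not the
    algorithm from the paper.
    """
    m = len(recipe)
    # prev[i] = smallest edit distance between recipe[:i] and any
    # substring of the record ending just before the current symbol.
    prev = list(range(m + 1))
    hits = []
    for j, symbol in enumerate(record):
        # curr[0] stays 0 so that a match may start at any position.
        curr = [0] * (m + 1)
        for i in range(1, m + 1):
            cost = 0 if recipe[i - 1] == symbol else 1
            curr[i] = min(
                prev[i - 1] + cost,  # match or substitution
                prev[i] + 1,         # extra symbol in the record (noise)
                curr[i - 1] + 1,     # recipe step missing from the record
            )
        if curr[m] <= max_errors:
            hits.append((j, curr[m]))
        prev = curr
    return hits


# Assumed encoding: R = rinse, C = clean, S = sterilize, D = dry, X = noise.
recipe = "RCRSD"
record = "XXRCRSDXXRCSDXX"  # one clean batch, then one with a missing rinse
hits = find_approximate_matches(recipe, record, max_errors=1)
```

With `max_errors=0` only exact occurrences of the recipe are reported; raising the tolerance also reports batches with a missing or spurious phase, at the cost of several neighbouring end positions being flagged around each true match. A real pipeline would need a further step to merge those neighbours into one batch.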