Environmental data stream mining through a case-based stochastic learning approach
Published in | Environmental modelling & software : with environment data news Vol. 106; pp. 22 - 34 |
---|---|
Main Authors | , |
Format | Journal Article Publication |
Language | English |
Published | Oxford: Elsevier Ltd, 01.08.2018 (Elsevier Science Ltd, Elsevier) |
ISSN | 1364-8152 1873-6726 |
DOI | 10.1016/j.envsoft.2018.01.017 |
Summary: | Environmental data stream mining is an open challenge for Data Science. Common methods used are static because they analyze a static set of data, and provide static data-driven models. Environmental systems are dynamic and generate a continuous data stream. Dynamic methods coping with the temporal nature of data must be provided in Data Science. Our proposal is to model each environmental information unit, timely generated, as a new case/experience in a Case-Based Reasoning (CBR) system. This contribution aims to incrementally build and manage a Dynamic Adaptive Case Library (DACL). In this paper, a stochastic method for the learning of new cases and management of prototypes to create and manage the DACL in an incremental way is introduced. This stochastic method works with two main moments. An evaluation of the method has been carried out using a data stream of air quality of the city of Obregón, Sonora, México, with good results. In addition, other datasets have been mined to ensure the generality of the approach.
• Our stochastic learning approach proposes a new incremental data-driven methodology for environmental data stream mining. • Our dynamical approach is able to identify and adapt upcoming patterns in the environmental data stream (concept drift). • Each new environmental data piece is modelled as a new case in a Dynamic Case-Based Reasoning system. • A Dynamic Adaptive Case Library (DACL) is incrementally created to manage the data stream mining process. • The stochastic learning approach is applied to an air quality assessment problem and additional datasets with good results. |
---|---|