Role of big-data in classification and novel class detection in data streams

Bibliographic Details
Published in Journal of Big Data, Vol. 3, no. 1, pp. 1-9
Main Author Chandak, M. B.
Format Journal Article
Language English
Published Cham Springer International Publishing 05.03.2016
Springer Nature B.V
ISSN 2196-1115
DOI 10.1186/s40537-016-0040-9

Summary: A “data stream” is defined as a class of data generated continuously over “text, audio and video” channels. Streams are of infinite length and may comprise structured or unstructured data. These features make it difficult to store and process data streams with simple, static strategies. Processing a data stream poses four main challenges to researchers: infinite length, concept-evolution, concept-drift and feature-evolution. Infinite length arises because the amount of data has no bound. Concept-drift is due to slow changes in the concept of the stream. Concept-evolution occurs due to the presence of unknown classes in the data. Feature-evolution is due to the progression of new features and the regression of old features. To perform any analytics on data streams, conversion to a knowledgeable form is essential. Researchers have proposed various strategies in the past, but most of the research has focused on the problems of infinite length and concept-drift. The research work presented in the paper describes an efficient string-based methodology to process “data streams” and control the challenges of infinite length, concept-evolution and concept-drift.
Subject areas: Data mining, Machine learning