Big Data Reduction Methods: A Survey

Bibliographic Details
Published in	Data science and engineering Vol. 1; no. 4; pp. 265 - 284
Main Authors	ur Rehman, Muhammad Habib; Liew, Chee Sun; Abbas, Assad; Jayaraman, Prem Prakash; Wah, Teh Ying; Khan, Samee U.
Format	Journal Article
Language	English
Published	Berlin/Heidelberg: Springer Berlin Heidelberg (Springer Nature B.V.), 01.12.2016
Subjects	Algorithm Analysis and Problem Complexity; Artificial Intelligence; Big Data; Chemistry and Earth Sciences; Computer Science; Data acquisition; Data complexity; Data compression; Data mining; Data Mining and Knowledge Discovery; Data reduction; Data systems; Data transmission; Database Management; Dimensionality reduction; Machine learning; Physics; Redundancy; Statistics for Engineering; Systems and Data Security
ISSN	2364-1185 2364-1541
DOI	10.1007/s41019-016-0022-0

Summary:	Research on big data analytics is entering a new phase called fast data, in which multiple gigabytes of data arrive in big data systems every second. Modern big data systems collect inherently complex data streams because of the volume, velocity, value, variety, variability, and veracity of the acquired data, which together give rise to the 6Vs of big data. Reduced and relevant data streams are perceived to be more useful than raw, redundant, inconsistent, and noisy data. Another motivation for big data reduction is that big datasets with millions of variables suffer from the curse of dimensionality, which demands unbounded computational resources to uncover actionable knowledge patterns. This article reviews the methods used for big data reduction. It also presents a detailed taxonomic discussion of these methods, covering network theory, big data compression, dimension reduction, redundancy elimination, data mining, and machine learning. In addition, it highlights the open research issues pertinent to big data reduction.