A Data Processing Algorithm to Approximate Big Stream Data Analysis
      
    
| Published in | 2025 International Conference on Machine Intelligence and Smart Innovation (ICMISI), pp. 247-250 |
|---|---|
| Main Authors | , , , , | 
| Format | Conference Proceeding | 
| Language | English | 
| Published | IEEE, 10.05.2025 |
| DOI | 10.1109/ICMISI65108.2025.11115782 | 
| Summary | Sampling large datasets for approximate data processing and analysis is a critical challenge in data management. This becomes increasingly complex with large, continuous data streams, especially when data is distributed across multiple sites and exceeds available resources. The key challenge lies in ensuring that the sampled data retains statistical characteristics similar to the entire dataset. In this paper, we propose an efficient algorithm designed to approximate big stream data analysis. The algorithm segments streaming data into randomized blocks, preserving the statistical properties of the original dataset. By dividing data streams into time-based windows and applying randomized sampling techniques, statistically consistent blocks are generated. Experimental results demonstrate the algorithm's effectiveness, highlighting its ability to reduce computational overhead, improve scalability, and operate effectively in IoT and sensor network environments. By enabling parallel processing of these consistent blocks, the proposed approach addresses real-time data handling challenges and large-scale analytics, paving the way for advancements in adaptive and distributed data stream management. |
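
The summary describes splitting a stream into time-based windows and then forming randomized blocks within each window. The record does not give the paper's actual procedure, so the following is only a minimal sketch of that general idea under stated assumptions: records are `(timestamp, value)` pairs, windows have a fixed length, and each window is shuffled and dealt into equally sized blocks. The function name `window_blocks` and its parameters are hypothetical and do not come from the paper.

```python
import random
from collections import defaultdict


def window_blocks(records, window_seconds, blocks_per_window, seed=0):
    """Illustrative only (not the paper's algorithm).

    Groups (timestamp, value) records into fixed-length time windows,
    then randomly partitions the values of each window into blocks.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible

    # Step 1: assign each record to a window by integer-dividing its timestamp.
    windows = defaultdict(list)
    for timestamp, value in records:
        windows[int(timestamp // window_seconds)].append(value)

    # Step 2: shuffle each window and deal its values round-robin,
    # so every block is a random subset and block sizes differ by at most one.
    result = {}
    for window_id in sorted(windows):
        values = windows[window_id]
        rng.shuffle(values)
        result[window_id] = [
            values[i::blocks_per_window] for i in range(blocks_per_window)
        ]
    return result


# Hypothetical usage: 120 one-second readings, 60-second windows, 4 blocks each.
readings = [(t, float(t % 7)) for t in range(120)]
blocks = window_blocks(readings, window_seconds=60, blocks_per_window=4)
```

Because each block is a uniform random subset of its window, per-block statistics such as the mean can be computed in parallel and compared against the full window. Whether this matches the sampling scheme evaluated in the paper cannot be determined from this record.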