Dataflow Processing
Since its first volume in 1960, Advances in Computers has presented detailed coverage of innovations in computer hardware, software, theory, design, and applications. It has also provided contributors with a medium in which they can explore their subjects in greater depth and breadth than journal ar...
| Field | Value |
|---|---|
| Main Authors | |
| Format | eBook |
| Language | English |
| Published | Chantilly: Elsevier Science & Technology, Academic Press, 2015 |
| Edition | 1 |
| Subjects | |
| Online Access | Get full text |
| ISBN | 9780128021347; 0128021349 |
Table of Contents:
- Front Cover -- Dataflow Processing -- Copyright -- Contents -- Preface -- Chapter 1: An Overview of Selected Heterogeneous and Reconfigurable Architectures -- 1. Introduction -- 2. Problem Statement -- 3. Existing Solutions and Their Criticism -- 3.1. Presentation of Existing Solutions -- 3.1.1. NVIDIA Fermi GPU -- 3.1.2. AMD ATI -- 3.1.3. Cell Broadband Engine Architecture -- 3.1.4. ClearSpeed -- 3.1.5. Maxeler Dataflow Engines -- 3.1.6. SGI's Reconfigurable Application-Specific Computing -- 3.1.7. Convey's Reconfigurable COP -- 3.2. Classification of Presented Solutions -- 4. Summary of Presented Solutions -- 4.1. Computation and Memory Capacity -- 4.2. Programming Considerations -- 5. Performance Comparison of Presented Solutions -- 5.1. Analytical Comparison -- 5.2. Experimental Comparison -- 6. Conclusion -- Acknowledgments -- References -- About the Author -- Chapter 2: Concurrency, Synchronization, and Speculation-The Dataflow Way -- 1. Introduction -- 2. Dataflow Concepts -- 2.1. Dataflow Formalism -- 2.2. Generalized Firing Semantic Sets (FSS) -- 2.3. Graphical Notations -- 2.4. Concurrency -- 2.5. Synchronization -- 2.6. Speculation or Greedy Execution -- 3. Dataflow Languages -- 3.1. Dataflow Structures -- 3.1.1. Direct Access Method -- 3.1.2. Indirect Access Method -- 3.1.3. Dennis's Method -- 3.1.4. I-Structures -- 3.1.5. Hybrid Scheme -- 3.2. Id: Irvine Dataflow Language -- 3.3. VAL -- 3.4. Streams and Iteration in a Single Assignment Language -- 4. Historical Dataflow Architectures -- 4.1. Dataflow Instructions -- 4.1.1. Classical Architectures -- 4.2. Static Dataflow Architectures -- 4.3. Dynamic Dataflow Architectures -- 4.4. Explicit Token Store -- 4.5. Dataflow Limitations -- 4.5.1. Localities and Memory Hierarchy -- 4.5.2. Granularity of Computations -- 4.5.3. Synchronous Execution -- 4.5.4. Memory Ordering
- 4.6. Hybrid Dataflow/Controlflow Architectures -- 4.6.1. Coarse-Grain Dataflow -- 4.6.2. Complex Dataflow -- 4.6.3. RISC Dataflow -- 4.6.4. Threaded Dataflow -- 5. Recent Dataflow Architectures -- 5.1. Tera-op Reliable Intelligently Adaptive Processing System -- 5.2. Data-Driven Workstation Network -- 5.3. WaveScalar -- 5.4. Teraflux -- 5.5. Maxeler -- 5.6. Codelet -- 5.7. Recent Architectures Summary -- 6. Case Study: Scheduled Dataflow -- 6.1. Organization -- 6.2. Instruction Format -- 6.3. Execution Paradigm -- 6.4. Architecture -- 6.4.1. Synchronization Pipeline -- 6.4.2. Execution Pipeline -- 6.4.3. Scheduling Unit -- 6.5. Support for Thread-Level Speculation -- 6.5.1. TLS Schema for SDF -- 6.5.2. Speculation Extensions to SDF -- 7. Conclusions -- References -- About the Author -- Chapter 3: Dataflow Computing in Extreme Performance Conditions -- 1. Introduction -- 2. Dataflow Computing -- 3. Maxeler Multiscale DFEs -- 4. Development Process -- 4.1. Analysis -- 4.2. Transformation -- 4.3. Partitioning -- 4.4. Implementation -- 5. Programming with MaxCompiler -- 6. Dataflow Clusters -- 6.1. Power Efficient Computing -- 6.2. Data Storage -- 6.3. Interconnections -- 6.4. Cluster-Level Management -- 6.5. Reliability -- 7. Case Study: Meteorological Limited-Area Model -- 7.1. The Application -- 7.2. The Model -- 7.3. The Algorithm -- 7.4. Analysis -- 7.5. Partitioning -- 7.6. Transformation -- 7.7. Parallelization -- 7.8. Experimental Setup -- 7.9. Results -- 7.10. Speedup -- 8. Conclusion -- References -- About the Author -- Chapter 4: Sorting Networks on Maxeler Dataflow Supercomputing Systems -- 1. Introduction -- 2. Motivation -- 3. Sorting Algorithms -- 3.1. Sequential Sorting -- 3.1.1. Optimal Algorithm for Small Input Data Sizes -- 3.2. Parallel Sorting -- 3.3. Network Sorting -- 4. Sorting Networks -- 4.1. Basic Properties
- 4.2. Constructing Sorting Networks -- 4.3. Testing Sorting Networks -- 4.4. Network Sorting Algorithms -- 4.4.1. Bubble Sorting -- 4.4.2. Odd-Even Sorting -- 4.4.3. Bitonic Merge Sorting -- 4.4.4. Odd-Even Merge Sorting -- 4.4.5. Pairwise Sorting -- 4.5. Comparison of Network Sorting Algorithms -- 4.6. Network Sorting Algorithm Operation Example -- 4.7. Network Sorting Versus Sequential and Parallel Sorting -- 5. Implementation -- 5.1. Dataflow Computer Code (MAX2 Card) -- 5.2. Control Flow Computer Code (Host PC) -- 6. Setting Up the Experiment -- 7. Experimental Results -- 7.1. FPGA Usage -- 7.2. Performance of Network Sorting Algorithms on the MAX2 Card -- 7.3. Comparison of Sorting Algorithms on the MAX2 Card and the Host CPU -- 8. Conclusion -- References -- About the Author -- Chapter 5: Dual Data Cache Systems: Architecture and Analysis -- 1. Introduction -- 2. A DDC Systems Classification Proposal -- 3. Existing DDC Systems -- 3.1. General Uniprocessor Compiler-Not-Assisted -- 3.1.1. The DDC -- 3.1.2. The STS Data Cache -- 3.2. General Uniprocessor Compiler-Assisted -- 3.2.1. The Northwestern Solution -- 3.3. General Multiprocessor Compiler-Not-Assisted -- 3.3.1. The Split Data Cache in Multiprocessor System -- 3.4. General Multiprocessor Compiler-Assisted -- 3.5. Special Uniprocessor Compiler-Not-Assisted -- 3.5.1. The Reconfigurable Split Data Cache -- 3.6. Special Uniprocessor Compiler-Assisted -- 3.6.1. The Data Type-Dependent Cache for MPEG Application -- 3.7. Special Multiprocessor Compiler-Not-Assisted -- 3.7.1. The Texas Solution -- 3.8. Special Multiprocessor Compiler-Assisted -- 3.8.1. The Time-Predictable Data Cache -- 3.9. A Comparison of the Existing Solutions -- 4. Conclusion of the Survey Part -- 5. Problem Statement for the Analysis -- 6. Critical Analysis of Existing Solutions -- 7. Generalized Solution
- 7.1. Basic Design of the Solution -- 7.2. Expected Improvement Analysis -- 8. Determining Locality -- 8.1. Compile-Time Resolution -- 8.2. Profile-Time Resolution -- 8.3. Run-Time Resolution -- 9. Modified STS in a Multicore System -- 10. Conditions and Assumptions of the Analysis Below -- 11. Simulation Strategy -- 11.1. Overall Performance -- 11.2. Effects of the Temporal Latency -- 11.3. Effects of the Temporal Cache Associativity -- 11.4. The X Limit Parameter -- 11.5. Power Consumption -- 12. Conclusions of the Analysis Part -- 13. The Table of Abbreviations -- Acknowledgments -- References -- About the Author -- Author Index -- Subject Index -- Contents of Volumes in This Series