Provenance and Annotation of Data and Processes: Second International Provenance and Annotation Workshop, IPAW 2008, Salt Lake City, UT, USA, June 17-18, 2008
This book constitutes the thoroughly refereed post-conference proceedings of the Second International Provenance and Annotation Workshop, IPAW 2008, held in Salt Lake City, UT, USA, in June 2008. The 14 revised full papers and 15 revised short and demo papers presented together with 2 keynote lectur...
Main Authors | , |
---|---|
Format | eBook |
Language | English Japanese |
Published | Berlin, Heidelberg: Springer Berlin / Heidelberg (Springer), 2008 |
Edition | 1 |
Series | Lecture Notes in Computer Science |
ISBN | 3540899642 9783540899648 |
Table of Contents:
- Measuring Workflow Similarity -- Clustering Algorithms -- Experimental Evaluation -- The Dataset -- Deriving Clusters -- Effectiveness of Clustering -- Workflow Representations: Graphs vs. Vectors -- Conclusion -- References -- Exploiting Provenance to Make Sense of Automated Decisions in Scientific Workflows -- Introduction -- Quality-Based Decision Processes -- Example -- Structure of the Decision Process -- Compiling Quality Processes to Workflows -- Role of Provenance -- The Quality Provenance Model -- Semantic Definition of Quality Processors -- Static Model -- Dynamic Model -- Querying the Model -- Conclusions -- References -- Using Explicit Control Processes in Distributed Workflows to Gather Provenance -- Introduction -- Provenance Gathering in Distributed Scientific Workflows -- Control Flow in Scientific Workflows -- Provenance Gathering in Heterogeneous WfMS -- Scientific Workflow Control Flows -- Control Flow Modules in VisTrails -- Execution Control on VisTrails -- Conclusion -- References -- ES3: A Demonstration of Transparent Provenance for Scientific Computation -- Introduction -- Model and Methodology -- Implementation -- Applications -- Hidden Provenance -- Nested Provenance -- Demonstration -- References -- Neuroimaging Data Provenance Using the LONI Pipeline Workflow Environment -- Introduction -- The LONI Pipeline Workflow Environment -- Goals of the LONI Pipeline Environment -- LONI Pipeline Provenance Architecture -- Data Provenance -- Processing Provenance -- Provenance Validation -- Discussion -- Conclusions -- References -- Provenance Tracking in an Earth Science Data Processing System -- Introduction -- Science Data Processing -- Data Archiving -- Primary and Secondary Metadata -- Reprocessing -- Provenance -- Scientific Reproducibility -- Process on Demand and Virtual Archives -- Provenance Problems
- Conclusion and Future Work -- References -- A Python Library for Provenance Recording and Querying -- Introduction -- Motivation -- Overview of the Python Library for Provenance -- Fundamentals of the EU Grid Provenance Concept -- General Architecture Overview -- API Description Overview -- Implementation Details -- Used Technologies and Methods -- Examples -- Current and Future Work -- Current State -- Future Work -- Conclusions -- References -- Requirements for a Provenance Visualization Component -- Introduction -- Motivation -- User Classification -- Generalized User Requirements -- Types -- Classification -- Visualization -- Visualization Examples -- Examples from Projects -- Current and Future Work -- Conclusions -- References -- Advances and Challenges for Scalable Provenance in Stream Processing Systems -- Introduction -- The TVC Model for Century and Resulting Limitations -- Challenges in the Practical Application of Model-Based Provenance -- Looking towards the Future: The CMIR Data Provenance Framework -- Challenges in CMIR-Based Provenance System Design -- Resolving Granularity Differences between Stream Data Producers and Consumers -- Granularity Resolution in Current Century Implementation -- Related Work -- Conclusions -- References -- Using Provenance to Support Real-Time Collaborative Design of Workflows -- Introduction -- Architecture -- SynchronizedDesign -- Algorithm -- Implementation -- Issues -- Discussion -- Use Cases -- Related Work -- Conclusion -- References -- Provenance in Sensornet Republishing -- Introduction -- Related Work -- Data Provenance in Sensornet Republishing -- Definition and Goals of Sensornet Provenance -- Approaches to Provenance for Sensornet Republishing -- Tracking the Transformation -- Data Disclosure for Provenance -- Implementation -- Predecessor Link -- Incremental Compression -- Evaluation
- Provenance Benefit
- Introduction -- Provenance in Kepler -- An Implementation Model -- Sub Goal 1: Data Ownership -- Sub Goal 2: Editing and Audit Trail -- Sub Goal 3: Data Annotation -- Sub Goal 4: Data Sharing -- Sub Goal 5: Data Audit and Verification -- Limitations and Conclusions -- References -- Kepler/pPOD: Scientific Workflow and Provenance Support for Assembling the Tree of Life -- Introduction -- The Kepler/pPOD System -- The Computation Model of Kepler/pPOD -- Recording and Representing Provenance in Kepler/pPOD -- Displaying and Browsing Provenance in Kepler/pPOD -- Conclusion and Future Work -- References -- Using Visualization Process Graphs to Improve Visualization Exploration -- Introduction -- Background -- The P-Set Model for Visualization Exploration -- Relations and Graphs for Visualization Analysis -- Visualization Process Relations -- Visualization Process Graphs -- Case Study: Improving the OASCBrowser -- Example Session and Analysis -- The Refined OASCBrowser -- Discussion -- Conclusions -- Future Work -- References -- Implementation and Evaluation of a Protocol for Recording Process Documentation in the Presence of Failures -- Introduction -- Protocol Outline -- Terminology -- Failure Assumptions -- Protocol Outline -- Implementation -- Performance Evaluation -- Throughput Experiment -- Throughput Experiment with Failures -- Benchmark Experiments -- Application Experiment -- Related Work and Conclusion -- References -- Provenance and the Price of Identity -- Introduction -- Foundations -- Current Available Strategies -- Strong Identification -- Strong Identification with IDSet -- Intermittent Identification -- Initial Identification -- Recreating Intermediate Data Items -- Evaluation -- Pros and Cons -- Time and Space -- Discussion -- Identification within an Implicit Workflow System -- Identification Across Disparate Workflow Systems
- Intro -- Title Page -- Preface -- Organization -- Table of Contents -- Keynotes -- Provenance for Database Transformations -- Enforcing the Scientific Method -- Papers -- Mapping the NRC Dataflow Model to the Open Provenance Model -- Introduction -- The NRC Dataflow Model -- Specification of Dataflows in NRC -- Past Executions of Dataflows -- NRC Dataflow Repository Model -- Formal Definition of OPM Graphs -- Mapping NRC Dataflow Runs to OPM Graphs -- Amendment for Multiple NRC Runs -- Incorporating Runs of Subdataflows -- Adding Subvalue Provenance to an OPM Graph -- Conclusion -- References -- Data Lineage Model for Taverna Workflows with Lightweight Annotation Requirements -- Introduction -- Baseline Model for Capturing and Querying Data Lineage -- Explicit and Implicit Collections -- Data Lineage Queries -- Lightweight Annotations for Improving Lineage Data -- Discussion and Conclusions -- References -- A Logic Programming Approach to Scientific Workflow Provenance Querying -- Introduction -- Frame Logic and FLORA-2 -- Mapping Virtual Data Schema to F-Logic -- FLOQ Query Examples -- Discussions and Related Work -- Conclusions and Future Work -- References -- Recording the Context of Action for Process Documentation -- Introduction -- Background and Motivation -- Modeling the Context of a Process -- Documenting Context in Service Based Architectures -- Evaluation -- Conclusion -- References -- User-Centric Annotation Management for Biological Data -- Introduction -- The ViP Framework -- User-Centric Time Semantics -- User-Centric Network Semantics -- Implementation Highlights -- Implementation Using Views -- User-Centric Access Control -- Prototype Highlights -- User Interface -- Visualization -- Demonstration Scenarios -- Conclusions -- References -- A Model for Sharing of Confidential Provenance Information in a Query Based System
- Related Work -- Conclusions -- References -- Towards Provenance-Enabling ParaView -- Introduction -- Related Work -- A Process-Driven Provenance Model -- Change-Based Provenance -- Capturing, Representing, and Re-playing Provenance -- Capturing Actions -- Representing Actions -- Re-playing Actions -- Case Study: ParaView -- Discussion -- References -- Application of Provenance for Automated and Research Driven Workflows -- Introduction -- Use Cases -- Automated Workflow -- User-Driven Research Workflow -- Use Case Findings -- Experiences -- Conclusions -- References -- Using Provenance to Improve Workflow Design -- Introduction -- Background -- Software Reuse and Component-Based Software Development -- Component-Based Workflow -- Recommendation Systems and Collaborative Filtering -- Workflow Process Recommendation in Vistrails -- Usage Details -- Conclusion -- References -- Job Provenance - Insight into Very Large Provenance Datasets -- Introduction -- Demonstration Scenario -- Evaluated Computational Experiment -- Visual Form-The Demo GUI -- Analysis Step by Step -- Batch Job Submission -- Experiment Setup -- Job Provenance Service -- Job Implementation -- Testbed -- Related JP Extensions -- Direct JPIS Database Access -- Application-Specific JP Type Plugin -- Configuration Extensions and Database Schema Changes -- Highlights and Conclusions -- References -- A Provenance-Based Fault Tolerance Mechanism for Scientific Workflows -- Introduction and Background -- The Kepler Provenance Framework -- Classifying Data-Dependencies -- Recording Data-Dependencies -- Scientific Workflow Doctor: Using Provenance Data for Fault Tolerance -- Related Work -- Conclusions and Future Work -- References -- A First Study on Clustering Collections of Workflow Graphs -- Introduction -- Clustering Workflows -- Alternative Workflow Representations