Karma Provenance Collection Tool
Overview
Provenance (also called lineage or trace) of digital scientific data is critical to broadening the sharing and reuse of that data. Provenance captures the information needed to attribute ownership and to determine, among other things, the quality of a particular data set. Provenance collection is often a tightly coupled part of a cyberinfrastructure system, but it is better served by a standalone tool. Karma is such a tool: it can be added to existing cyberinfrastructure to collect and represent provenance data. Karma uses a modular architecture that supports multiple instrumentation plugins, which makes it usable in different architectural settings.
Visualization of provenance data is more useful when it supports manipulating very large structures, displaying different views, and interactivity. Such support helps users navigate their experiment information with a mental map of what is going on in the experiment, compare different experiment runs quantitatively, and perform model selection through effective collaboration between the user and the discovery system. We developed two Cytoscape plugins to aid the visual representation and navigation of provenance information.
The Karma Provenance Tool is licensed under the Apache License, Version 2.0 (the "License") (http://www.apache.org/licenses/LICENSE-2.0). The code is copyrighted by The Trustees of Indiana University. Karma is a product of the Data to Insight Center of the Pervasive Technology Institute (http://pti.iu.edu) at Indiana University. See Digital Data Provenance for more information.
Features of Latest Release (v3.2.1)
- Improvement of query performance through caching of provenance graphs.
- Implementation of several query API calls.
- Multiple bug fixes.
Publications
- Bin Cao, Beth Plale, Girish Subramanian, Ed Robertson, Yogesh
Simmhan, Provenance Information Model of Karma Version 3, IEEE 2009 Third International Workshop on Scientific Workflows
(SWF'09), July 2009.
- Bin Cao, Girish Subramanian, Beth Plale, Poster: Provenance
Collection in an Industry Biochemical Discovery Cyberinfrastructure,
IEEE e-Science, Indianapolis, IN, December 2008.
- The Open Provenance Model (v1.01). Moreau, L. (Editor), B. Plale, S.
Miles, C. Goble, P. Missier, R. Barga, Y. Simmhan, J. Futrelle, R.
McGrath, J. Myers, P. Paulson, S. Bowers, B. Ludaescher, N.
Kwasnikowska, J. Van den Bussche, T. Ellkvist, J. Freire, P. Groth,
Technical Report, Electronics and Computer Science, University of
Southampton, 2008. http://eprints.ecs.soton.ac.uk/16148
- Yogesh L. Simmhan, Beth Plale, Dennis Gannon, Query Capabilities of the Karma Provenance Framework, Concurrency and Computation: Practice and Experience, Vol 20, Issue 5, pp. 441-451, John Wiley and Sons, 2008.
- Yogesh Simmhan, Beth Plale, and Dennis Gannon, Karma2: Provenance Management for Data Driven Workflows, Extended and invited from ICWS 2006. International Journal of Web Services Research, IGI Publishing, Vol 5, No 2, 2008.
- Yogesh Simmhan, Beth Plale, Dennis Gannon, Towards a Quality Model for Effective Data Selection in Collaboratories,
IEEE Workshop on Workflow and Data Flow for Scientific Applications
(SciFlow06), held in conjunction with ICDE, Atlanta, GA, April 2006.
- Yogesh Simmhan, Beth Plale, Dennis Gannon, A Performance Evaluation of the Karma Provenance Framework for Scientific Workflows,
International Provenance and Annotation Workshop (IPAW'06), Lecture
Notes in Computer Science 4145, L. Moreau and I. Foster (Eds),
Springer-Verlag, Berlin Heidelberg, pp. 222-236, 2006.
- Yogesh Simmhan, Beth Plale, and Dennis Gannon, A Framework for Collecting Provenance in Data-Centric Scientific Workflows, Proceedings of the IEEE International Conference on Web Services pp. 427-436, 2006.
- Yogesh L. Simmhan, Beth Plale, and Dennis Gannon, A Survey of Data Provenance in e-Science, ACM SIGMOD Record, Vol. 34, No. 3, September 2005.
- Yogesh L. Simmhan, Beth Plale, and Dennis Gannon, A Survey of Data
Provenance Techniques, Technical Report TR-618, Computer Science
Department, Indiana University, Bloomington, 2005.
Contact
- Beth Plale [plale at indiana dot edu]
- Yiming Sun [yimsun at indiana dot edu]
Project Contributors
Current:
- Beth Plale, Project Director
- Scott Jensen, Senior Researcher
- You-Wei Cheah
- Peng Chen
- Devarshi Ghoshal
- Yuan Luo
Historical:
- Yiming Sun, Senior Software Developer
- Mehmet Aktas, Associated Faculty
- Bin Cao
- Dennis Gannon
- Prajakta Purohit
- Ed Robertson
- Yogesh Simmhan
- Girish Subramanian