Provenance Capture of Unmanaged Workflows with Karma, with Beth Plale

Abstract: For
the digital data created as an outcome of scientific discovery to
retain its value over time, the data must undergo some level of
curation. In order for archival of scientific
data to be fully realized, however, curation costs must come down. This
will be achieved in part through tools that automate metadata and
provenance collection.
In this talk I present a logical architecture of a standalone
provenance system, and the Karma system that implements it. We focus on
the implications of unmanaged workflows, particularly for the
representation of provenance information. Achieving flexible
forms of provenance creation involves tradeoffs in where the burden of effort
lies and in the accuracy of the results. Finally,
we discuss an evaluation of the performance of Karma under two capture
scenarios and increasing workloads and determine the system to be
scalable to a mid-range workload.

Bio: Beth Plale is Director of the Data to Insight Center and an Associate Professor in the School of Informatics and Computing at Indiana University Bloomington. Professor Plale did her postdoctoral work at Georgia Institute of Technology and has a Ph.D. in computer science from State University of New York Binghamton. Plale is an experimental computer scientist whose research is on data cyberinfrastructure and tools in an interdisciplinary research setting. Her research interests are in data provenance, metadata catalogs, automated digital curation, workflow systems in e-Science, and complex events processing. Plale is a recipient of the DOE Early Career award and is an ACM Senior Member and IEEE Member.

This talk was sponsored by the Data to Insight Center.