Welcome to IU Data to Insight Wiki
Data to Insight Center Initiatives: Documentation
Provenance and Metadata
The Data to Insight Center has a strong focus on metadata and provenance. The center champions the widespread preservation of scientific data and advocates for tools that capture metadata and provenance automatically, reducing the need for manual capture.
The center supports XMC Cat, a metadata catalog capable of hosting metadata in multiple community XML schemas simultaneously. More information is available on the XMC Cat project page. XMC Cat was used in the NSF Linked Environments for Atmospheric Discovery (LEAD) project (2003-2009) and is being used in the WIYN Consortium's One-Degree Imager project and the Social-Ecological Informatics project.
GENI Engineering Conference (GEC14) NetKarma Tutorial
- Provenance and Metadata Project Pages
- Provenance: Karma User Manual | Karma Visualization Manual
- XMC Cat: XMC Cat Manual | XMC Cat FAQ | XMC Cat User Manual
- XMC Cat Builder: XMC Cat Builder Manual | XMC Cat Builder Architecture
Do you have a question about XMC Cat, or have you run into issues? If so, please email us at email@example.com
This section contains links to pages for design discussions on metadata and provenance related tools developed by the Data to Insight Center, including XMC Cat and Karma.
Recently emerged hydrodynamic coastal ocean models, such as the Sea, Lake and Overland Surges from Hurricanes (SLOSH) model, have high resource requirements and run as ensemble applications that require high-throughput computing. The cloud is well suited to such computationally intensive, bursty workloads of embarrassingly parallel tasks. We are currently working on using the Trident Workflow Workbench and the Sigiri resource scheduler to orchestrate and execute SLOSH ensemble tasks on the Azure cloud.
Click on a project to see resources:
Preserving scientific digital data and ensuring continued access to it has emerged as a major initiative for both funding agencies and academic institutions. Digital preservation includes the processes, organizations, and technologies needed to maintain scientific digital data over time.
- SEAD (pronounced "seed") is a collaboration among the University of Michigan, Indiana University, and the University of Illinois. This NSF-funded DataNet project will develop a set of tools that will be documented here.
Data to Insight Center Internal Wiki
- New Hire Information
- Contact Information
- Personnel Responsibilities
- Software Best Practices
- Services and Resources
- Infrastructure Information
- Internal Project Information
Helpful Hints for MediaWiki Usage
- Wiki syntax cheat-sheet can be found here ;-)
- Configuration settings list
- MediaWiki FAQ
- MediaWiki release mailing list