Main Page

Welcome to the IU Data to Insight Center Wiki


Data to Insight Center Initiatives

Provenance and Metadata

The Data to Insight Center has a strong focus on metadata and provenance. The center champions widespread preservation of scientific data and advocates for tools that capture metadata and provenance automatically, thereby reducing manual capture.

The center supports the XMC Cat metadata catalog, which can host metadata in multiple community XML schemas simultaneously. More information is available on the XMC Cat project page. XMC Cat was used in the NSF Linked Environments for Atmospheric Discovery (LEAD) project (2003-2009) and is being used in the WIYN Consortium's One-Degree Imager project and the Social-Ecological Informatics project.

GENI Engineering Conference (GEC14) NetKarma Tutorial


Do you have a question on XMC Cat or have you run into issues? If so, please email us at:

Design Discussions

This section links to design discussion pages for the metadata- and provenance-related tools developed by the Data to Insight Center, including XMC Cat and Karma.

XMC Cat Design Discussions

Karma Design Discussions

Cloud Computing

Recently emerged hydrodynamic coastal ocean models, such as the Sea, Lake and Overland Surges from Hurricanes (SLOSH) model, have high resource requirements and consist of ensemble applications that require high-throughput computing. Cloud computing is well suited to such computationally intensive, bursty workloads of embarrassingly parallel tasks. We are currently working on using the Trident Workflow Workbench and the Sigiri resource scheduler to orchestrate and execute SLOSH ensemble tasks on the Azure cloud.

Click on a project to see resources:


Curating and preserving scientific data and ensuring its continued access has emerged as a major initiative for both funding agencies and academic institutions. Digital curation includes the processes, organizations, and technologies needed to maintain scientific data and add value to it over time.

  • SEAD (Sustainable Environment - Actionable Data, pronounced "seed") is a collaboration between the University of Michigan, Indiana University, and the University of Illinois.

For more information, see the project webpage.

Socio-Ecological Informatics


Additional resources

Computational Humanities

Hathitrust Research Center

Additional resources

Data to Insight Center Internal Wiki

The D2I internal wiki contains detailed information about personnel and projects, services and resources, and software practices. It is intended for internal use, so an additional login is required.

[Click here to access it.]

Other Quick Links

Helpful Hints for MediaWiki Usage
