|
XML Metadata Concept Catalog (XMC Cat) |
XMC Cat is a metadata catalog that stores rich metadata describing data objects that are themselves stored in files, storage repositories, or on the web. Its features include adaptability to domain schemata through configuration instead of code changes, support for automatic capture of metadata through the use of curation plugins, and search and browse capabilities through a web-based GUI that is dynamically generated from a domain schema. It can be deployed in different scientific and... |
|
Data Catalog |
Data Catalog harvests data product metadata from distributed THREDDS catalogs into an XMC Cat instance. The metadata (including data product location) is then available to applications via the XMC Cat API. Data Catalog metadata harvesting employs a shared-nothing ingest pipeline to allow for the indexing of large catalogs such as NEXRADIII, and has indexed over 17 thousand collections and 2 million files. |
|
Karma Provenance Collection Tool |
The Karma tool is a standalone tool that can be added to existing cyberinfrastructure for purposes of collection and representation of provenance data. Karma utilizes a modular architecture that permits support for multiple instrumentation plugins that make it usable in different architectural settings. |
|
HathiTrust Research Center |
The HathiTrust Research Center (HTRC) enables computational access for nonprofit and educational users to published works in the public domain stored within the HathiTrust Digital Library, an extensive collaborative digital library of nearly 10 million volumes and 2 billion pages of archived material maintained by major research institutions and libraries worldwide. |
|
Sustainable Environment-Actionable Data |
Awarded through NSF's DataNet program, the Sustainable Environment-Actionable Data (SEAD) project will develop tools and services for active curation and long-term preservation of scientific data, while also engaging researchers through social networking tools. SEAD will enable new modalities of sustainability science -- the study of dynamic interactions between nature and society by advancing... |
|
Secure Computational and Data Environments for Non-Consumptive Research |
In this project, researchers at the University of Michigan and the Data to Insight Center are developing a “data capsule framework” that is founded on a principle of “trust but verify”. That is, the informatics scholar is given freedom to experiment with new algorithms on a huge body of copyrighted or otherwise protected information, but technological mechanisms are in place to verify compliance with the policy of non-consumptive research. |
|
Hierarchical MapReduce |
We present a hierarchical MapReduce framework that gathers computation resources from different clusters and runs MapReduce jobs across them. The global controller in our framework splits the data set and dispatches the pieces to multiple "local" MapReduce clusters, and balances the workload by assigning tasks in accordance with the capabilities of each cluster and of each node. The local results are then returned to the global controller for global reduction. |
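The split, dispatch, and global-reduce flow described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the framework's actual code: the function names, the word-count job, and the single integer "capacity" per cluster are all assumptions made for the example, and per-node balancing within a cluster is left out.

```python
from collections import Counter
from typing import Dict, List


def split_by_capacity(records: List[str], capacities: Dict[str, int]) -> Dict[str, List[str]]:
    """Global controller step: give each cluster a contiguous share of the
    records, sized in proportion to its (hypothetical) capacity score."""
    total = sum(capacities.values())
    names = list(capacities)
    shares: Dict[str, List[str]] = {}
    start = 0
    for i, name in enumerate(names):
        if i == len(names) - 1:
            end = len(records)  # last cluster takes whatever remains
        else:
            end = start + len(records) * capacities[name] // total
        shares[name] = records[start:end]
        start = end
    return shares


def local_mapreduce(records: List[str]) -> Counter:
    """Stand-in for a job on one local cluster: a plain word count."""
    counts: Counter = Counter()
    for line in records:
        counts.update(line.split())
    return counts


def global_reduce(partials: List[Counter]) -> Counter:
    """Global controller step: merge the partial results from every cluster."""
    merged: Counter = Counter()
    for partial in partials:
        merged.update(partial)
    return merged


def run_hierarchical(records: List[str], capacities: Dict[str, int]) -> Counter:
    shares = split_by_capacity(records, capacities)
    partials = [local_mapreduce(chunk) for chunk in shares.values()]
    return global_reduce(partials)
```

For example, `run_hierarchical(["a b", "b c", "c c", "a"], {"big": 3, "small": 1})` sends three records to the hypothetical cluster `big` and one to `small`, then merges the two partial counts. In a real deployment each `local_mapreduce` call would be a remote job submission rather than an in-process function call.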
|
PRAGMA at IU |
IU provides a virtual cluster consisting of a frontend node and 3 compute nodes. Additional virtual clusters may be made available in the future. |
|
Socio-Ecological Informatics |
Social-ecological researchers study the interactions of the environment, users, and governance of environmental resources. The research undertaken by the Social Ecological Informatics group applies database and data management, information retrieval, knowledge management, human-computer interaction design, and ontological tools and approaches to enhancing the value of social-ecological data for research and policy use. |
|
Streamflow |
Streamflow integrates data streams into a standard workflow system through a programming model approach that introduces new workflow semantics that enable scientific workflow designers to incorporate data streams into the experiment without major changes to the infrastructure. It utilizes XBaya as a graphical client program for workflow composition, execution and monitoring. |
|
Sigiri |
We propose a simple abstraction for interacting with heterogeneous resource managers spanning grid and cloud computing, and focus on features that make the tool useful for the mid-scale physical or natural scientist. Key strengths of the abstraction are its support for multiple standard job specification languages, preservation of direct user interaction with the service, removal of the delay that can come through layers of services, and predictable behaviour under heavy loads. |
|
InstantKarma |
The project improves the collection, preservation, utility and dissemination of provenance information within the NASA Earth Science community. It customizes and integrates Karma, a proven provenance tool, into NASA data production by collecting and disseminating provenance of Advanced Microwave Scanning Radiometer - Earth Observing (AMSR-E) standard data products, initially focusing on Sea Ice. The Sea Ice science team and user community are advisers and the project adheres to the Open... |
|
NetKarma |
As computer network experiments increase in complexity and size, it becomes increasingly difficult to fully understand the circumstances under which the experiment was run, particularly when these results are shared for purposes of reproducibility. The provenance of an experiment is its lineage or historical trace that can capture experiment conditions, time ordering, and relationships within the experiment and across the experiment and infrastructure layer. The GENI Provenance Registry (... |
|
Linked Environments for Atmospheric Discovery II (LEAD II) |
LEAD II is a follow-on to the successful Linked Environments for Atmospheric Discovery (LEAD) project, an NSF-funded large-scale ITR. LEAD II carries the vision of LEAD forward into new areas as it explores research challenges in hybrid computing and in the manipulation and use of weather data in non-weather applications. |