D2I: FRIEDA: Flexible Robust Intelligent Elastic Data Management in Cloud Environments

Wednesday, February 6, 2013 - 4:00pm - 5:00pm

Informatics West 107, 919 E 10th Street, Bloomington, IN

Devarshi Ghoshal, Doctoral Student, Data to Insight Center, Indiana University

Abstract: Scientific applications are increasingly using cloud resources for their data analysis workflows. However, managing data effectively and efficiently over these cloud resources is challenging due to the myriad storage choices with different performance-cost trade-offs, complex application choices, the complexity associated with elasticity, and failure rates. The explosion in scientific data, coupled with the unique characteristics of cloud environments, requires a more flexible and robust distributed data management solution than those currently in existence. This paper describes the design and implementation of FRIEDA, a Flexible Robust Intelligent Elastic Data Management framework. FRIEDA coordinates data in a transient cloud environment, taking into account specific application characteristics. Additionally, we describe a range of data management strategies and show the benefit of flexible data management schemes in cloud environments. We study two distinct scientific applications, one from bioinformatics and one from image analysis, to understand the effectiveness of such a framework.

Bio: Devarshi Ghoshal is a fourth-year PhD candidate in the School of Informatics and Computing at Indiana University, Bloomington. He is pursuing his PhD in Distributed Systems with a minor in Programming Languages. His current work focuses on building tools for collecting provenance in distributed environments. He recently completed an internship at Lawrence Berkeley National Laboratory in the Computational Research Department, where he developed a framework for managing data and processes over cloud resources.

