Indiana University


HathiTrust Research Center


The HathiTrust Research Center (HTRC) enables nonprofit and educational users to have computational access to published works in the public domain stored within the HathiTrust Digital Library, an extensive collaborative digital library of nearly 10 million volumes and 2 billion pages of archived material maintained by major research institutions and libraries worldwide.

The HTRC is a collaborative research center launched jointly by Indiana University and the University of Illinois, along with the HathiTrust Digital Repository, to help researchers meet the technical challenges of dealing with massive amounts of digital text. It develops cutting-edge software tools and cyberinfrastructure to enable advanced computational access to the growing digital record of human knowledge.

Sloan Project for Non-consumptive Research

Over the last three years (2011-2014), through generous funding from the Alfred P. Sloan Foundation, HTRC has developed the HTRC Data Capsule as a secure framework through which researchers and educators can interact with restricted content. See the project page for more details. The outcome of the HTRC Data Capsule project, as per the original proposal, is a prototype system that demonstrates non-consumptive, computational access to a restricted full-text corpus. A public version of the final report is available.

Jan 16, 2015 - The HathiTrust Research Center v3.0 Beta Release

HTRC v3.0 was released for beta testing on January 23, 2015. This release features the integration of the HTRC Data Capsule, plus a more welcoming portal, enhanced workset builder functionality and improved security features. The HTRC Data Capsule provides a secure computation and data environment for non-consumptive research. It permits analytical investigation of a corpus, e.g. copyrighted volumes, but prohibits data from leaving the capsule. The other notable enhancements for the 3.0 release include:

  • Automatically saving jobs upon completion
  • Corrected use of faceted search
  • Single sign-on on the HTRC portal (Workset Builder single sign-on is in progress)
  • More welcoming portal UI design
  • User passwords must be at least 15 characters

See the public announcement for details.

March 31, 2013 - The HathiTrust Research Center (HTRC) releases new tools to mine the world's largest digital repository of books. In phase two of the HTRC (September 2012-March 2013), the HTRC Technical Working Group created production versions of the beta services previewed at the 2012 UnCamp event. The group is now working to open the resources to community testers who are part of the HTRC User Group Community. (For subscription details, see: HTRC-UserGroup-L)

The HTRC service stack, which provides the analytical entry point, is based on a completely new technical architecture. This framework leverages existing analytics tools such as SEASR, digital library software such as Blacklight, and a services-oriented architecture application interface. The current production phase includes an HTRC Sandbox that is open to scholars for evaluation of the HTRC services stack as part of their experiments.

How to Get Involved



Click here or see below.


Related News, Events and Publications:

Job opening: Postdoctoral Fellow in Data To Insight Center. HTRC is looking for a postdoc in CS systems and security for Data Capsules for non-consumptive analysis.
Topic Exploration with the HTRC Data Capsule for Non-Consumptive
HathiTrust: Large-Scale Repository in the Humanities - Unlocking the Secrets of 4.6 Billion Pages
Save the date! HTRC UnCamp, March 30-31, 2015. The HTRC UnCamp is aimed at digital humanities tool developers, researchers and librarians of HathiTrust member institutions, and graduate students. Attendees will be asked for their input in planning sessions, so please plan to register early!
(Post)humanism and Reading via Fragments: Some thoughts about the HathiTrust Research Center's New Textual Tools
IU develops Komadu, a new suite of data provenance software tools. The Indiana University Data to Insight Center (D2I) has released a new suite of software tools, Komadu, designed to help researchers track and verify digital data, a crucial step in computational research.
DL: HathiTrust Research Center: Challenges and Opportunities in Big Text Data. Miao Chen, D2I postdoctoral researcher, gives a talk during the DL Brown Bag Series.
Architecture to enable large-scale computational analysis of millions of volumes. The HathiTrust Research Center (HTRC) is a collaborative research center that provides Digital Humanities researchers with access not only to millions of volumes from the HathiTrust (HT) digital library but also to cutting-edge software tools and cyberinfrastructure for performing advanced computational analysis over the corpus at an unprecedented scale.
HathiTrust Research Center: Big Data for Digital Humanities: A Panel Discussion on Managing Big Data and Big Metadata. Building a digital library bridge to big data research is a complex proposition. Big data is an emerging research paradigm that is developing different approaches in different disciplines and domains, ranging from calls for the creation, from the ground up, of a new ‘fourth paradigm’ for science, to more immediate and pragmatic concerns with building tools and services that can be used to mine existing data sets in the sciences, social sciences, arts and humanities, and other arenas. This...
Big Data at Scale for Digital Humanities: An Architecture for the HathiTrust Research Center. Big Data in the humanities is a new phenomenon that is expected to revolutionize the process of humanities research. The HathiTrust Research Center (HTRC) is a cyberinfrastructure to support humanities research on big humanities data. The HathiTrust Research Center has been designed to make the technology serve the researcher - to make the content easy to find, to make the research tools efficient and effective, to allow researchers to customize their environment, to allow researchers to...


Beth Plale [plale at indiana dot edu]


IU Contributors

  • Stacy Kowalczyk
  • Robert McDonald
  • Zong Peng
  • Robert Ping
  • Beth Plale
  • Guangchen Ruan
  • Yiming Sun
  • Felix Terkhorn
  • Aaron Todd
  • Jiaan Zeng
University of Illinois Contributors
  • Loretta Auvil
  • Boris Capitanu
  • J. Stephen Downie
  • Harriet Green
  • Kirk Hess
  • Scott Poole
  • David Tcheng
  • John Unsworth


Sponsors, Mar 2011 - present