The End of Term Web Archive: Collecting & Preserving the .gov Information Sphere

Streaming Media





In the fall of 2016, a group of institutions – Internet Archive, Library of Congress, California Digital Library, and libraries from the University of North Texas, Stanford University, and George Washington University – organized to preserve a snapshot of the federal government's websites. This is the third time this End of Term (EOT) group has organized with the goals of identifying, harvesting, preserving, and providing access to a snapshot of the federal government web presence. They do this for two important reasons. The first is that the transition of elected officials in the federal government’s executive branch prompts a reset of sites like, so it’s critical to document the changes. The second is that the EOT group’s work provides a broad snapshot of the federal domain once every four years, and the resulting archive is replicated among a number of organizations for long-term preservation.

Jefferson Bailey from the Internet Archive and James Jacobs from Stanford University Libraries discussed the project’s methods for identifying and selecting in-scope content, strategies for capturing web content, and access models for collected content. The two highlighted the challenges and opportunities of large-scale, distributed, multi-institutional, born-digital collecting and preservation efforts; how the project aligns with participant institutions’ collection mandates; the project’s importance for archiving historically valuable but highly ephemeral web content without a clear steward; and how the breadth and size of the EOT Web Archive inform both new methods of collaboration and new models for data-driven access and analysis by researchers. Our speakers also discussed the project’s alliance with other government data preservation projects, as well as ideas and future plans for long-term, sustainable methods for collecting, preserving, and maintaining the .gov information ecosystem.

Publication Date



American Politics | Archival Science | Collection Development and Management | Policy Design, Analysis, and Evaluation
