I haven't talked at all about collection development
 
 

I. History of Prior Research

Overview
The University of Vermont continues to develop projects that explore the capture, access, navigation and use of digital facsimiles created from primary source materials. The present project builds on this prior research, as well as that of the Making of America project, the Colorado Digitization project, the many projects of the Joint Information Systems Committee of Great Britain, and others. A description of the more important prior research at UVM follows.

A. The George Perkins Marsh Research Center
United States minister to the new kingdom of Italy in 1860, lawyer, businessman, scholar, language expert, and author of Man and Nature, George Perkins Marsh was one of the first to recognize and describe in detail the significance of human action in transforming the natural world. The center provides transcriptions, annotations and images of 650 selected letters in Marsh's correspondence, as well as explanatory essays. A generous grant from the Woodstock Foundation has supported this work.

B. Inventories of the University of Vermont's Manuscript Collections
A software grant from the Enigma Corporation, formerly Inso, allowed UVM to establish its first two SGML-based text encoding projects. These inventories, or Finding Aids, are EAD-encoded texts that allow for global searching of UVM's

C. The Eugenics Collection (available soon at http://www.uvm.edu/~eugenics)
Funded by a grant from The Web Project of the Vermont Institute for Science, Math and Technology, The Eugenics Collection brings together over 150 documents related to the eugenics movement in Vermont. Selected from UVM's Special Collections and a variety of state repositories, the documents

D. Fleming Museum Collection Catalog
 

E. Experimental Electronic Text Collections (History Review, Godey, Alice B. Neal Haven)
 

II. Standards and Best Practices

To enhance interoperability with other digitization efforts, the project will use established practices and standards where they exist.
The National Digital Library Federation, a program of the Council on Library and Information Resources, has suggested three types of metadata for digital surrogate collections: intellectual, structural, and administrative. Intellectual metadata describes the content of each digital object for purposes of cataloguing. Existing standards for intellectual metadata are the library-based USMARC record and the Encoded Archival Description (EAD) DTD for finding aids. Structural metadata is the information that describes the internal organization of the digital object for purposes of navigation. For example, in a digital facsimile of a book one might wish to go to the next page, to the next chapter, or to the table of contents. Administrative metadata records information about the digital object. This includes technical specifications related to image capture, enhancements to the digital object, information related to copyright and intellectual property rights, and information that should remain with the object to ensure its long-term retention and use. Standards for structural and administrative metadata being developed by the Making of America project will be used for this project.
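As a rough illustration of how the three metadata types might sit together on one digital object, the following Python sketch models a digitized book. All class and field names here are hypothetical and chosen only for illustration; they are not taken from the Making of America, USMARC, or EAD specifications.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class IntellectualMetadata:
    """Describes the content of the object for cataloguing."""
    title: str
    creator: str = ""
    source_record: str = ""  # hypothetical pointer to a USMARC or EAD record


@dataclass
class AdministrativeMetadata:
    """Records information about the digital object itself."""
    capture_device: str = ""
    bit_depth: Optional[int] = None
    dpi: Optional[int] = None
    rights: str = ""


@dataclass
class StructuralMetadata:
    """Describes internal organization for navigation."""
    pages: List[str]  # image file names in reading order
    chapters: Dict[str, int] = field(default_factory=dict)  # title -> first page index
    toc_page: Optional[int] = None  # index of the table of contents, if any

    def next_page(self, current: int) -> Optional[int]:
        """Index of the following page, or None at the end of the book."""
        return current + 1 if current + 1 < len(self.pages) else None

    def next_chapter(self, current: int) -> Optional[int]:
        """Index of the first page of the next chapter, or None if there is none."""
        for start in sorted(self.chapters.values()):
            if start > current:
                return start
        return None


@dataclass
class DigitalObject:
    intellectual: IntellectualMetadata
    structural: StructuralMetadata
    administrative: AdministrativeMetadata
```

With a structure like this, the navigation examples in the paragraph above (next page, next chapter, table of contents) become simple lookups against the structural metadata, independent of how the object is catalogued or captured.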

Although there is no one standard for capture and storage of digital surrogates, a number of best practices are being developed that balance long-term preservation needs with current technical limitations. At a minimum, this project will follow the Technical Recommendations for Digital Imaging Projects from the Image Quality Working Group of ArchivesCom, a joint Libraries/AcIS committee at Columbia University (http://www.columbia.edu/acis/dl/imagespec.html). These recommendations call for capturing bitonal images (printed text, line drawings) at 1-bit, 600 effective dpi, to be stored as uncompressed TIFF files; 8-bit greyscale for black and white photos; and 24-bit color, 300 effective dpi, for color images. The capture process will depend on the location and nature of the original. A combination of digital SLR cameras, digital video cameras, and flat-bed scanners will be used.
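The capture settings named above can be written down as a small lookup table, which also makes it easy to estimate storage needs per image. The sketch below is illustrative only: the names CaptureSpec, CAPTURE_SPECS, and uncompressed_bytes are hypothetical, the greyscale resolution is left unset because this section does not state one, and the size estimate covers raw pixel data only (no file headers or row padding).

```python
from typing import NamedTuple, Optional


class CaptureSpec(NamedTuple):
    bit_depth: int       # bits per pixel
    dpi: Optional[int]   # effective dpi; None where this section gives no value
    storage: str         # storage format, where stated


# Values as listed in this section; keys are hypothetical labels.
CAPTURE_SPECS = {
    "bitonal": CaptureSpec(bit_depth=1, dpi=600, storage="uncompressed TIFF"),
    "greyscale": CaptureSpec(bit_depth=8, dpi=None, storage=""),
    "color": CaptureSpec(bit_depth=24, dpi=300, storage=""),
}


def uncompressed_bytes(width_in: float, height_in: float, spec: CaptureSpec) -> int:
    """Estimate raw pixel data size in bytes for an original of the given size in inches."""
    if spec.dpi is None:
        raise ValueError("no resolution is specified for this material type")
    pixels = round(width_in * spec.dpi) * round(height_in * spec.dpi)
    # Round up to a whole number of bytes.
    return (pixels * spec.bit_depth + 7) // 8


# Example: a letter-size page (8.5 x 11 inches).
bitonal_size = uncompressed_bytes(8.5, 11, CAPTURE_SPECS["bitonal"])
color_size = uncompressed_bytes(8.5, 11, CAPTURE_SPECS["color"])
```

Under these assumptions a letter-size bitonal page comes to roughly 4 MB of raw pixel data and a color page to roughly 25 MB, which gives a first-order sense of the storage the archive would need per page.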

Note: specify that we will follow AMICO specifications for images (see Colorado for link).

The Burlington Agenda: Research Issues in Intellectual Access to Electronically Published Historical Documents, a report on a meeting funded by the University of Vermont and the National Historical Publications and Records Commission (NHPRC), points out the limitations of today's search engines in providing intellectual access to the contents of electronically published historical documents. In an effort to address these limitations, this project will also rely on the standards developed by the Text Encoding Initiative (TEI) and Model Editions Partnership (MEP) to encode documents.

Within the context of the project itself, procedural standards will be developed for digital capture, encoding, and cataloguing to ensure that participants within UVM and partners from other institutions can create interoperable collections.
 

III. Technology Infrastructure

Creation of a secure, useable, and reliable digital surrogate archive depends on robust technology that can
   - zoo cluster
   - storage space
   - reliability (mirror, RAID 5, redundant)
   - speed
   - archiving (CD-RW, tape)
   - life cycle management (simple data refresh and data migration)
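
One way the data refresh and data migration item above could be checked in practice is to compare checksums of the files before and after they are copied to new storage. The following Python sketch is an assumption about how such a check might look, not a description of an existing UVM procedure; the function names sha256_of and verify_migration are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_migration(source_dir, target_dir):
    """Return relative paths of source files that are missing or altered in the target."""
    source_dir = Path(source_dir)
    target_dir = Path(target_dir)
    problems = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_dir)
        dst = target_dir / rel
        if not dst.is_file() or sha256_of(src) != sha256_of(dst):
            problems.append(rel.as_posix())
    return problems
```

An empty list from verify_migration would suggest that every file in the old store arrived intact in the new one; any paths it returns would need to be recopied before the old media are retired.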

software:
  - continue funding maintenance of DynaText/DynaWeb for display/navigation
  - open source tools for integration
  - low-cost commercial products for XML/SGML tagging
  - commercial products for image manipulation and OCR
 

IV. Training

  - capture - training to standards
  - cataloguing - library standards
  - encoding - XML/SGML tools, TEI/MEP DTDs
  - K-16 use of collection: tutorials, suggestions
 

V. Methodology
 

VI. Evaluation (see MOA)
 
 - who will do it?

we need to:
 - evaluate the appropriateness of the collection as well as the structural and administrative metadata: is it valuable to students, scholars, interested others?
 - evaluate the ease of use including navigation, display, searching. "will attempt to measure not only if the MoAII Testbed
    architecture improved search capabilities and navigation options over their print counterparts, but whether they encourage new
    ways of searching and understanding the materials represented."
 - technical evaluation of the architecture: performance issues, scalability, connectivity of Z39.50/MARC/SGML
 - scalability of training: can VT be taught to grow the archive?

work in (from MOA):
"For the proposed project, the IMG will employ classroom experiments, online questionnaires, and various qualitative methods to
    assess the feasibility and utility of network-based finding aids. The evaluation plan will call for data collection, interpretation, and
    reporting at three intervals: prior to implementation; at mid-project; and just before project closure. IMG's participation will include
    evaluation planning and administration, development of qualitative and quantitative measures, data gathering, analysis, and
    reporting."
 
 

Links to things mentioned in this document:

 - MOA II: planning docs, tech docs, etc.
 - Colorado: http://coloradodigital.coalliance.org/projplan.html
 - ARL DIGITAL INITIATIVES DATABASE
 - MOA Cornell: http://moa.cit.cornell.edu/MOA/
 - Columbia, assorted projects
 - Berkeley, Digital Scriptorium

 - NINCH, National Initiative for a Networked Cultural Heritage
 - Berkeley, Digital Images and Text link page: includes articles and papers, companies, resources, links
 - Columbia: Technical Recommendations for Digital Imaging Projects
 - JISC/HEDS

 - TEI
 - EAD
 - MEP
 - UVM etext collections