Trip Report, Hope Greenberg to:

Electronic Book Technologies, Inc.
Sept. 23-27, 1996, Providence, RI

Three years ago I attended a conference at Georgetown University hosted by the Association for Computers and the Humanities (ACH) and the Association for Literary and Linguistic Computing (ALLC). One year later I attended a conference in Washington, D.C. hosted by the Association of University Presses. Each of these conferences introduced something that would eventually lead to Paul Philbin, Wiz Dow, and me spending a week in a classroom in Providence.

At ACH-ALLC the big news was that the second draft of the massive TEI DTD had begun. The TEI, or Text Encoding Initiative, is "an international project to develop guidelines for the preparation and interchange of electronic texts for scholarly research." The TEI DTD is the SGML Document Type Definition that has resulted from that project. That is, it is an evolving suite of SGML tags (currently the print documentation is up to 1300 pages) designed to mark up and allow the rendering of texts: books, poems, playscripts, filmscripts, broadsides--any printed material, in any language, that might be of interest to scholars.

At the University Press conference, representatives from Berkeley demonstrated the Encoded Archival Description project, or EAD DTD. This DTD is designed for the markup of Finding Aids. In most libraries, Finding Aids are a set of notebooks that detail the items in Special Collections. For example, one might list "one box of items from Dorothy Canfield Fisher containing 20 letters, a scarf, two pair of shoes" etc. For the most part, these Finding Aids have not made it into the online world, which makes searching for and finding this kind of resource an arduous or "hear about it through the grapevine" kind of task. The EAD DTD project has now been acquired by the Library of Congress.

Both of these DTDs continued to develop, despite the fact that robust, cheap tools for editing, parsing, and publishing SGML did not exist. That is still primarily the case, but several months ago a possibility arose. Electronic Book Technologies, or EBT, an electronic publishing company, made available a grant that would provide its suite of SGML tools (street price: $180,000+) to organizations in exchange for the cost of training on those tools. Quite a few universities have jumped at the chance and, after some discussion with the library folks and the OK from Norman, we submitted our grant proposal.

Summary of the Training: There are several ways to use the products. In one model, documents are created in your favorite word processor, with extensive use of styles. The document is then run through a product that allows you to map the styles to SGML tags. It generates an SGML-marked-up document and a DTD that matches it. You then open this document in a style sheet editor and decide how you want various versions to be formatted and viewed. For example, you might have a view with lots of formatting designed for a stand-alone version of the book. Or you might have a view that "dumbs down" the document for use on the web, or a view that contains only the Table of Contents, or specific searches, etc. You then publish the book and, if you want it to be readable on the web, run it through another product named DynaWeb.

The book is then ready to be read in its full version by the DynaText browser, or, if it has been prepped for the web, in its web version by any web browser.
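The style-mapping step in this first model can be pictured with a small Python sketch. This is only an illustration of the idea, not EBT's converter: the style names, the tag names, and the functions to_sgml and to_dtd are all hypothetical, and the text is assumed to contain no markup characters such as "<" or "&".

```python
# Hypothetical word processor styles mapped to hypothetical SGML tags.
STYLE_TO_TAG = {
    "Heading 1": "title",
    "Body Text": "para",
    "Block Quote": "quote",
}

def to_sgml(paragraphs, root="book"):
    """Wrap each (style, text) pair in the tag mapped to its style."""
    lines = [f"<{root}>"]
    for style, text in paragraphs:
        tag = STYLE_TO_TAG[style]  # a KeyError here flags an unmapped style
        lines.append(f"<{tag}>{text}</{tag}>")
    lines.append(f"</{root}>")
    return "\n".join(lines)

def to_dtd(paragraphs, root="book"):
    """Derive a minimal DTD that permits exactly the tags actually used."""
    used = sorted({STYLE_TO_TAG[style] for style, _ in paragraphs})
    decls = [f"<!ELEMENT {root} - - ({' | '.join(used)})+>"]
    decls += [f"<!ELEMENT {tag} - - (#PCDATA)>" for tag in used]
    return "\n".join(decls)

if __name__ == "__main__":
    sample = [("Heading 1", "Trip Report"), ("Body Text", "We went to Providence.")]
    print(to_sgml(sample))
    print(to_dtd(sample))
```

The point of the sketch is that the DTD falls out of the document rather than the other way around: whatever styles the author used determine which elements get declared.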

This model, where a document carries its own custom DTD with it and the browser needs to interpret the document based on that DTD, is quite different from the web model, where the DTD is hard-coded into the browser and all documents must conform to that DTD to be read--hence the "view this with Browser X only" notes on so many web pages.

This method is also good for people who do not want to know about the SGML behind the book or who are not trying to adhere to a particular DTD. In our case we had two specific DTDs that we wanted to work with, so the process was a bit different. We either defined a suite of styles based on our DTD or tagged a document by hand, then ran it through a DOS program (MKBOOK) to parse it and verify the SGML. The end product could then be run through the style sheet editor and DynaWeb. With a DTD like the TEI that has hundreds of tags, this can be a laborious process, so several people, myself included, are working on word processor style templates to ease it. A certain amount of this can also be automated with macros if documents are created consistently.
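To give a feel for what "verify the SGML" means when tagging by hand, here is a toy check in Python. It is not MKBOOK and not a real SGML parser: it assumes every element is explicitly closed (real SGML permits tag omission), it ignores attributes and content models, and the function name check_markup and the allowed tag set are hypothetical.

```python
import re

# Matches an opening or closing tag and captures its name.
TAG_RE = re.compile(r"<(/?)([A-Za-z][\w.-]*)[^>]*>")

def check_markup(text, allowed):
    """Return a list of problems: unknown tags, mismatched closes, unclosed tags."""
    problems = []
    stack = []
    for match in TAG_RE.finditer(text):
        closing, name = match.group(1), match.group(2).lower()
        if name not in allowed:
            problems.append(f"unknown tag: {name}")
            continue
        if not closing:
            stack.append(name)
        elif stack and stack[-1] == name:
            stack.pop()
        else:
            problems.append(f"unexpected close: {name}")
    # Anything left on the stack was opened but never closed.
    problems.extend(f"unclosed tag: {name}" for name in reversed(stack))
    return problems

if __name__ == "__main__":
    allowed = {"body", "head", "p"}  # illustrative subset only
    print(check_markup("<body><head>Letters</head><p>One box.</body>", allowed))
```

Even a check this crude catches the two mistakes that hand tagging produces most often: a tag that is not in the DTD at all, and a tag that was opened and never closed.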

Comments on the product: Designed before the web, EBT's products are definitely geared to an electronic "book." That is, they are designed to create a unit that is fairly hierarchical (though hypertext is a large part of it) and has a Table of Contents, divisions (like chapters, sections, etc.), and pagination. However, hypermedia documents are expected and, of course, searching by more than just simple keywords is also a strength. For an example, just look at the documentation at the EBT web site, which was created with the DynaText/DynaWeb products.

Now that SGML tools are beginning to make their way into the larger market, we may find that we do not need the entire suite. For example, we have 10 copies of the DynaText browser (both Mac and Windows), but SoftQuad's browser, Panorama, looks like it could do that part of the job as well. We'll have to continue investigating these tools.

Futures: Our plan so far, as outlined in our grant proposal to EBT, is to have a Finding Aid for the George Perkins Marsh papers up within six months. Also, we will have some primary resources from Special Collections online and a Master's Thesis (guess whose) up as well.

In an amazing coincidence Paul, Wiz and I had been talking about the potential for electronic archiving of theses and dissertations. I came back to find that Geoff had been speaking with the Grad College about e-theses as well. So we are getting the concerned parties together to pursue this. Based on a perusal of the web, this is a wide open area where UVM could take a lead. Virginia Tech has received funding from various sources, including University Microfilms, to develop a PDF to microfilm model but that's about it.

But that's the next step. Right now we have to learn the intricacies of this thing, figure out how to disseminate and support it if it seems like it can work, and then look at the possibilities.

Questions, comments, and suggestions are, as always, welcome!

Trip Report: EBT, Inc., Providence, RI, 9/23-27, 1996. Computing and Information Technology, University of Vermont, 10 Oct 1996.