
A Hypertextual History of Humanities Computing: Convergence and Collaboration

This brings us to the last fifteen years or so, a time in which the computer has become a standard requirement in every academic's office (certainly no longer an accessory). Modern history courses tend to stop at the year in which the undergraduate students were born, and I am not proposing to spend particularly long in the present, for it is here with us today. There are so many projects proposed, underway, or completed which are in some way centred around some concept of text that it would be impossible to detail them all. As for which will prove of the greatest significance, well, we shall probably have to wait a further thirty years or more.

The 1980s were marked by the development of the personal computer, not least by IBM and Apple. Decreasing costs and the introduction of personal word-processing software not only meant that a larger number of scholars became familiar with the idea of computer-assisted research, but also that enough machines could be purchased to create computing classrooms, not only within established computing centres but also in humanities faculties and departments. In parallel with these generic developments was the growth of computer-assisted instruction. Already popular in the sciences (as predicted by Joseph Raben), this form of computing began to be used for language teaching (of which the University of Hull was an early pioneer) and for subject areas within history and archaeology. In the United States the University of Minnesota developed computer-assisted language learning for Sumerian, Egyptian and classical Greek. Apart from the work done in this area at Hull, the University of Durham developed an award-winning network of personal computers and software for the teaching of koine or New Testament Greek.

The late 1980s and the 1990s are also marked by the tying together of various strands, or the production of standard interfaces for performing particular tasks. Three quite different (and yet slowly converging) strands come to mind. First is the development of standards for the encoding of electronic texts. In the 1960s and early 1970s, talk of standards for machine-readable texts often referred not to the actual tagging of the text but to the choice of input and storage methods. That is no longer a problem, but what can still be a problem is the means by which essential information is embedded in an electronic text.
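To make the point concrete, here is a minimal sketch of the kind of descriptive markup at issue (the element names are invented for illustration and belong to no particular scheme): the essential structural information travels inside the text itself, enclosed in named tags, rather than being implied by typography or by one program's private file format.

    <anthology>
      <poem id="p1">
        <title>An Example Poem</title>
        <stanza>
          <line n="1">First line of the first stanza,</line>
          <line n="2">second line of the first stanza.</line>
        </stanza>
      </poem>
    </anthology>

Any program which understands the tags can then retrieve, say, every title, or the second line of every stanza, regardless of the machine or software with which the file was made.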

The second great development of the late eighties, and especially of the nineties, is the rapid growth of the Internet, particularly the creation, storage, and presentation of electronic texts through the World Wide Web.

Third, hypertext, multimedia and hypermedia are now common keywords. The most significant development was the introduction of hardware and software which enabled texts to be read and analysed on the screen. The growth of vast personal processing power and storage inevitably led to the development of texts or editions which were only ever intended to be read via a screen. The ability not only to create networks of texts but also to incorporate images, sound, and video as intrinsic elements of the text redefines the notion of an edition (the ideal multimedia presentation would be one in which no medium has greater importance than the others - that is, there is no base medium, such as the text, to which images and sound are attached as mere additions). The Canterbury Tales project demonstrates a scholarly edition which cannot be printed in a way which preserves its form. This applies also to many CD-ROM presentations and to the myriad WWW pages.

1986 The Standard Generalized Markup Language (SGML) was defined as an international standard, ISO 8879.

Hlink 1. All about SGML

1986 Oxford University Press announced that an electronic version of the new OED would be produced: 60 million words, 21,000 pages, 2.25 million citations. The final dictionary on CD-ROM was released in 1992.

Hlink 2. From CD-ROM to WWW: The OED on-line

1989 Tim Berners-Lee at CERN, the European Laboratory for Particle Physics in Geneva, developed the World Wide Web.

Hlink 3. The earliest Web browser at CERN (by telnet)

1989 The Computers and Manuscripts Project received funding from the Leverhulme Trust to begin work on the scholarly electronic edition of the Canterbury Tales. Todd Bender had argued, in 1976, that computers could best be used to preserve the creation and transmission history of a particular text, rather than perpetuating the myth (as he saw it) of a final fixed text. The Canterbury Tales Project is designed to produce more than an edition of Chaucer's work. The first part of the project, now on the verge of completion, covers the Prologue to the Wife of Bath's Tale: not one edition of the text but the images and transcriptions of all 55 manuscripts and the four pre-1500 printed editions. Such an edition has never been published before, and this edition (if we can continue to call it that) will not see the printed page. The manuscript folios have all been digitised and transcribed. A collation of the witnesses is included, but so is the Collate program itself, so that scholars could, if they so wished, create their own collations. The ability to move easily between manuscripts, transcriptions, variants and so on means that any two scholars are unlikely to read it in the same way. Moreover, this CD-ROM edition of the Prologue destabilises all other printed editions by virtue of the fact that it testifies to no single edition of the text. Anyone using this version of the Prologue will need a certain precision when referring to the text - one can no longer assume that it is to the Riverside Chaucer we should turn in order to check a reference.

Hlink 4. The Wife of Bath's Prologue on CD-ROM

1987 The Text Encoding Initiative was established as a joint venture between the Association for Computers and the Humanities (ACH), the Association for Computational Linguistics (ACL), and the Association for Literary and Linguistic Computing (ALLC). A goal of the TEI was the production of a set of guidelines to provide a standard format for data interchange in humanities research and to suggest principles for the encoding of texts in that format. The TEI Guidelines were, from the start, concerned both with 'how' encoding should be preserved for interchange purposes and with 'what' should in fact be encoded. The TEI Guidelines are an application of SGML. A first version was released in instalments between March 1992 and December 1993. The latest version (P3) was issued as a two-volume work in May 1994, co-edited by Lou Burnard and Michael Sperberg-McQueen.

Hlink 5. All about The Text Encoding Initiative Guidelines
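To give a flavour of the Guidelines (a much-simplified sketch: the element names come from the TEI's published SGML tag set, but the document and header contents here are invented for illustration), a TEI text pairs an explicit header, recording bibliographic and encoding information, with the encoded text itself:

    <TEI.2>
      <teiHeader>
        <fileDesc>
          <titleStmt><title>An Example Poem: an electronic edition</title></titleStmt>
          <publicationStmt><p>Distributed for demonstration purposes only.</p></publicationStmt>
          <sourceDesc><p>Transcribed from a hypothetical 1831 printed edition.</p></sourceDesc>
        </fileDesc>
      </teiHeader>
      <text>
        <body>
          <div1 type="poem">
            <head>An Example Poem</head>
            <lg type="stanza">
              <l n="1">First line of verse,</l>
              <l n="2">second line of verse.</l>
            </lg>
          </div1>
        </body>
      </text>
    </TEI.2>

Because the header travels with the text, anyone receiving the file can discover its source and its encoding decisions without recourse to external documentation - precisely the interchange problem the TEI set out to address.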

1988 The Women Writers Project commenced at Brown University. Its aim was to create a full-text database of women's writing in English from 1330 to 1830. The Project differed from others in its avowed attempt to make visible what had hitherto been invisible (at least in classrooms and bookshops): pre-Victorian women's writing. In this case the electronic text was not a reproduction of the printed book but rather an attempt to "substitute for the absence of books" (as Professor Kathryn Sutherland, director of Project Electra, put it). As in its 'daughter', Project Electra, the texts themselves are encoded using SGML markup tags, following the guidelines of the Text Encoding Initiative.

Hlink 6. Women Writers Project at Brown University and Project Electra: women's writings in English from the period 1780-1830

January 1991 The British National Corpus project commenced. Its aim was to encode 4,124 modern British English texts of all kinds (including transcribed interviews), amounting to over 100 million words encoded in SGML. The BNC was produced by a consortium of leading dictionary publishers (OUP, Longman, Chambers-Harrap) and academic research centres (Oxford University Computing Services, the Unit for Computer Research in the English Language at Lancaster University, and British Library Research and Development). The first release of the corpus on CD-ROM was announced in February 1995.

Hlink 7. The British National Corpus

1993 The National Center for Supercomputing Applications released Mosaic, a browser for the World Wide Web. Mosaic became the application most closely identified with the WWW, and its popularity as a gateway to the Internet rocketed (particularly when a Windows version was released). The process by which people began to identify Mosaic with the Internet had commenced...

June 1993 Chadwyck-Healey announced the release of their long-awaited Patrologia Latina Database. The price was set at £27,000, making it not only the largest full-text database but also one of the most expensive. The PLD comprises the 221 volumes of Migne's Patrologia Latina, including annotations, commentaries and images. The PL has been placed on CD-ROM "as is", including Migne's errors, evoking criticism from Paul Tombeur (CETEDOC) that the errors and wrong attributions which scholars have devoted much time to correcting will once more reappear in scholarly works by those who treat the PLD as the corpus of Patristic and Medieval literature.

Hlink 8. Chadwyck-Healey and all their works

The most recently created applications I have demonstrated have been hypertext or multimedia presentations. One of George Landow's recent works suggests that hypertext marks the convergence of literary theory and electronic texts. Certainly there is still much talk of hypertext creating an author out of every reader. The author is dead, long live the author - where the living author is she who was once assumed to be the reader. But, one may ask, are these new authors merely pretenders to the throne? Behind every hypertext system, every multimedia CD-ROM, any computer-based edition which presents the user with choices, there is an original creator, a first mover. One can compare some of the current theories to the postulation of a deity which created the world, created humans, and gave us all free will. We only think we have complete freedom to roam, but we forget the origins of that free will and the origins of the environment in which we wander. Thus the new reader may well think she can freely choose where to go today, but one may well ask who gave her the choices in the first place. Who created the possibilities? The hypertext system does not kill authors but merely removes them one step further from their creations. Possibly.

August 27, 1994. The Economist reports that "Databases are transforming scholarship in the most conservative corners of the academy, forcing technological choices even on to the humanities".

A similar headline could have appeared in 1974 or even 1964.



This document created: 4 February 1996
This document last revised: 4 February 1996
Author: Michael Fraser
The URL of this document is http://info.ox.ac.uk/ctitext/history/converge.html