Computers & Texts No. 14, April 1997
University of Oxford
Chadwyck-Healey are well known for their series of large textual databases on CD-ROM, especially the four-CD set of English Poetry from 600-1900 CE. They have now made a number of these databases available on a World Wide Web site, on a subscription basis. In addition, they offer a range of other resources, and links to other Internet sites. Subscription is by IP address or range of addresses (such as an entire university domain like ox.ac.uk): access is then transparent to any users within the registered domain, and the pages appear as just another web site, accessible from any platform with a graphical browser.
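For readers unfamiliar with subscription by address range, the following is a minimal sketch of how such a check might work in principle. It is not Chadwyck-Healey's actual mechanism, which the site does not document: the function name `is_subscriber` and the list `REGISTERED_RANGES` are hypothetical, and the address blocks are reserved documentation ranges rather than any real institution's.

```python
import ipaddress

# Hypothetical list of registered subscriber ranges. These blocks are
# reserved for documentation and stand in for a real institution's range.
REGISTERED_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),     # a whole illustrative campus range
    ipaddress.ip_network("198.51.100.0/25"),  # a smaller departmental range
]


def is_subscriber(client_ip: str) -> bool:
    """Return True if the client address falls inside any registered range."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in REGISTERED_RANGES)


if __name__ == "__main__":
    # A machine inside a registered range is admitted without a login;
    # one outside every range would be refused.
    print(is_subscriber("192.0.2.17"))
    print(is_subscriber("203.0.113.5"))
```

Because the test depends only on where a request comes from, no passwords are needed, which is why access appears transparent to users inside the registered domain.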
The databases that are available at present (Chadwyck-Healey state that they intend to add material to the site as it becomes available) are English Poetry, English Drama (combining the English Verse Drama and English Prose Drama CD-ROMs), Eighteenth Century Fiction, American Poetry, and African-American Poetry. Additions scheduled for 1997 include Early English Prose Fiction, Editions and Adaptations of Shakespeare, and The Bible in English (the King James 'Authorised Version' is available in the Reference section, see below). It is possible to subscribe to some or all of these: for those who subscribe to all of them, one great advantage over the CD-ROM editions is that it is possible to search over all the databases at once.
A search for 'ecstasy' across all the databases currently available, for instance, produced a tally of 496 hits (183 entries) in American Poetry, 1264 hits (436 entries) in English Poetry, 21 hits (9 entries) in African-American Poetry, 210 hits (134 entries) in English Drama, and 34 hits (12 entries) in Eighteenth Century Fiction. Searching speed is obviously dependent on the load, on the Internet in general as well as on the Chadwyck-Healey site, but generally seems quicker than the networked CD-ROMs available in Oxford; and it is much more convenient to be able to use a general-purpose browser than special software.
Access to the texts is a little long-winded: first comes a 'Summary of Matches' in each database, then clicking on one of those gives author and title details for the 'hits'. Clicking on one of those produces a 'Context of Matches' page giving ten words of context each side of the search term: that leads to a 'Full Text' page offering a choice of 'Contextual Table of Contents', 'Text Only', and 'First Hit'; and finally clicking on the last of these takes one to the text at the first occurrence of the search term. It would be nice to have the opportunity to go straight to the hit in its full context from the author/title details: the procedure adopted is presumably designed to minimise downloads, as the texts are split into quite sizeable chunks. Performance at the moment is good, however: whether it would remain so if take-up were heavy is uncertain, but Chadwyck-Healey seem determined to keep the site accessible.
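The per-database tallies quoted above for 'ecstasy' can be totalled to give a sense of the scale of a cross-database search. The short sketch below does only that arithmetic: the figures are those reported in this review, while the variable names are illustrative and not part of the LION interface.

```python
# (hits, entries) per database for the search on 'ecstasy', as reported above.
ecstasy_results = {
    "American Poetry": (496, 183),
    "English Poetry": (1264, 436),
    "African-American Poetry": (21, 9),
    "English Drama": (210, 134),
    "Eighteenth Century Fiction": (34, 12),
}

# Sum the two columns separately across all five databases.
total_hits = sum(hits for hits, _ in ecstasy_results.values())
total_entries = sum(entries for _, entries in ecstasy_results.values())

print(f"{total_hits} hits in {total_entries} entries")  # 2025 hits in 774 entries
```

English Poetry alone accounts for well over half of both totals.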
As well as access to the CD-ROM material, the Literature Online (henceforth LION) site offers a variety of 'added value'. There is a 'Master Index' listing not only the texts in Chadwyck-Healey's own databases but also other sources on the Internet. A search in this on 'Wordsworth', for instance, threw up 1956 texts: although a considerable number of these citations are to the English Poetry database, there was a substantial set of extra references to such sources as the Bartleby Library at Columbia (http://www.columbia.edu/acis/bartleby) and 'Steve's Poem Page' (http://www.crocker.com/~slinberg/poems). One use of this should be to get access to texts by 20th-century authors not included in Chadwyck-Healey's own databases, but whether by design (worries over copyright?) or accident, the index is not yet very helpful in this area. A search on 'Yeats' produced no hits, whereas an AltaVista search on 'W. B. Yeats' produced about 400 hits, many of which included full texts of poems (e.g. http://www.maths.tcd.ie/pub/yeats/Index.html).
Similarly, the list of 'Further Web Resources', a database of discussion lists and groups, author pages, research resources, and journals, is potentially very useful but as yet not fully developed. Yeats fares better here, but in the 'author pages' section only two are mentioned, in Sligo and Japan: I should have thought that, for example, the Yeats Society of New York page would be worth a mention (http://www.panix.com/~wlinden/yeats.shtml). I feel that this section should really be made available for free as a service to the general public, rather than included in the subscription service: the public would then feel more inclined to help keep the lists up to date, and it would make good commercial sense to attract people to the web site. The model here is perhaps the one sector of the Internet economy which has cracked all the commercial problems, the provision of pictures of people without any clothes on. Typically Internet porn sites offer some free 'samples' and links alongside subscription services for the panting customer: Chadwyck-Healey might consider following in their footsteps.
Finally, there is a set of 'reference works' which is potentially one of the most useful features of the site. At the moment these include Webster's Third New International Dictionary (1993 edition); the King James 'Authorised Version' of the Bible; the Bibliography of American Literature, a record of primary bibliography containing 'nearly 40,000 records of the literary works of approximately 300 American writers from the period of the Revolution to 1930'; and the Periodical Contents Index: Literature, 'an index to over 800,000 periodical articles in the field of literature'. It is planned, however, to add to these this year The Annual Bibliography of English Language and Literature (ABELL), which will be a major asset. The data from 1980-1994 will be the first available (from May 1997), and ultimately all records going back to 1920 will be online, with a choice of subscription levels depending on the starting date (e.g. for 10-15 concurrent users the subscription, without any discounts, will be £1250 for 1980+, £2500 for 1960+, and £3750 for 1920+). Parts of ABELL, which is published by the Modern Humanities Research Association, have been available online at Cambridge by a telnet connection, and the question will naturally be raised as to whether a project like this should be placed in the hands of a commercial publisher or should have been mounted by a UK academic institution (as the telnet service was by Cambridge). My own feeling is that provided Chadwyck-Healey do a good job on digitising and maintaining the data, there is no reason why the MHRA should not go to a publisher in this way.
Although I have some reservations about the user interface, there is no denying that LION is impressive in what it provides, and will be even more so when ABELL is fully online. Institutions will have to look carefully at the pricing of the various services, particularly if they have some of the CD-ROMs already: it is to be hoped that some kind of CHEST deal might be possible for UK institutions. Chadwyck-Healey's products are expensive for small users, but much more economical for institutions: and the databases have interest for more than just English specialists (experto crede). So a broad welcome for the new initiative, but take a good look at the figures involved for full access!
Computers & Texts 14 (1997), 18. Not to be republished in any form
without the author's permission.
HTML Author: Michael Fraser (firstname.lastname@example.org)
Document Created: 24 May 1997
The URL of this document is http://info.ox.ac.uk/ctitext/publish/comtxt/ct14/fowler.html