ICAME Collection of English Language Corpora

Description: The ICAME collection contains a number of corpora on a single CD-ROM. Most are available in TACT, WordCruncher and text format. The CD-ROM includes:

Brown Corpus - one million running words of written American English, divided into categories such as journalism, fiction, non-fiction and miscellaneous; in total, there are five hundred samples of around two thousand words each, compiled at Brown University during 1963-4;

LOB corpus - one million running words of British English, in five hundred samples of around two thousand words each, compiled at the University of Oslo and the University of Lancaster during 1970-76;

London-Lund corpus - half a million words of spoken British English, compiled at Lund University using spoken material from Category II of the Survey of English Usage Corpus. The corpus includes prosodically transcribed broadcast and recorded material (including dialogue, telephone conversations, and oration);

Helsinki corpus - 1.5 million words of written English from the period 850-1720, released by the University of Helsinki in 1991. Text types include sermons, diaries, plays, correspondence, and legal and scientific documents;

Kolhapur Corpus - one million running words of written Indian English, compiled at Shivaji University, Kolhapur, during 1980-86; five hundred texts are divided into categories such as newspaper, fiction, non-fiction and miscellaneous.

The corpora are also available separately. See the ICAME web page for more information.

Requirements: PC with CD-ROM drive, or Apple Macintosh with CD-ROM drive, or UNIX system with CD-ROM drive.

Further information: from the ICAME web page at http://www.hd.uib.no/corpora.html

Distributor: ICAME, The Norwegian Computing Centre for the Humanities, Harald Haarfagresgt. 31, N-5007 Bergen, Norway. Tel. +47 55 58 29 54/5/6. Fax. +47 58 94 70. Email: icame@hd.uib.no

Price: 3000 NOK (approx. £250).

Version available at CTI Centre: Current.

