CTI Textual Studies Q & A

Text Analysis Tools

Q Is there any text analysis software available for helping to determine authorship?

A There is no software, as far as I am aware, intended specifically for determining authorship or comparing the styles of different texts. The names usually associated with computer-assisted authorship studies (A Q Morton, Anthony Kenny etc.) either programmed their own software on a mainframe, used a combination of concordance software and statistical packages (for chi-squared tests etc.), or employed a statistician. A Q Morton, for example, encapsulates all of these.

Concerning your example, I suspect that your sample might be considered too small for statistical purposes. Anthony Kenny, if I recall correctly, says more about sample sizes in The computation of style: an introduction to statistics for students of literature and humanities (Oxford, 1982) (and probably also in A stylometric study of the New Testament (Oxford, 1986)).

I would advise you to look at our summary of more general text analysis tools in case you are interested in generating frequency indexes, concordances, collocations etc. WordSmith Tools may be of some help to you since it has a facility to generate a wordlist from a corpus of texts and then to compare a wordlist from an individual text against this corpus list. The output includes a chi-squared test of significance. I do not know enough about stylistics to know whether this facility is genuinely of use for authorship studies. If you are interested in taking this further then you may wish to contact Dr David Mealand at Edinburgh who has done some detailed stylistic analysis of New Testament texts (much of which has been published in the journal Literary and Linguistic Computing). His email is David.Mealand@ed.ac.uk
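To give a rough idea of what a wordlist comparison with a chi-squared score involves, here is a minimal Python sketch. It is not WordSmith Tools' own procedure, which I have not examined. The function names (word_counts, chi_squared, compare_wordlists) and the simple tokenisation are assumptions made purely for illustration. For each word in the individual text it builds a 2x2 table (occurrences of the word against all other words, in the text and in the reference corpus) and computes Pearson's chi-squared statistic.

```python
import re
from collections import Counter


def word_counts(text):
    """Lower-case the text and count runs of letters and apostrophes as words.

    This tokenisation is a deliberate simplification (an assumption).
    """
    return Counter(re.findall(r"[a-z']+", text.lower()))


def chi_squared(a, b, c, d):
    """Pearson chi-squared (1 degree of freedom, no continuity correction)
    for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # A zero row or column total makes the statistic undefined.
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def compare_wordlists(text, reference):
    """Score every word of `text` against `reference`, highest score first.

    Returns tuples of (word, count in text, count in reference, chi-squared).
    """
    text_counts = word_counts(text)
    ref_counts = word_counts(reference)
    text_total = sum(text_counts.values())
    ref_total = sum(ref_counts.values())
    rows = []
    for word, a in text_counts.items():
        c = ref_counts.get(word, 0)
        score = chi_squared(a, text_total - a, c, ref_total - c)
        rows.append((word, a, c, score))
    return sorted(rows, key=lambda row: row[3], reverse=True)
```

A score above about 3.84 corresponds to significance at the 5% level for one degree of freedom, but the approximation is unreliable when the counts are small, which is one reason sample size matters so much in this kind of work. A high score shows only that a word's frequency differs between the two samples; whether that difference says anything about authorship is a separate question.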

Further to my response above, the enclosed announcement concerning the latest issue of Computers & the Humanities might be of general interest to you.

---------- Forwarded message ----------
Date: Wed, 1 Jan 97 16:25:38 EST
From: "Nancy M. Ide"<ide@cs.vassar.edu>
Subject: Computers and the Humanities Vol 30 No 3
Volume 30 No. 3 1996

The third number of Volume 30 (1996) of Computers and the Humanities (CHum) has just been published by Kluwer Academic Press.

This issue introduces a new feature of the journal entitled "Debates in Humanities Computing". This first debate in the series treats the controversial topic of statistical methods for authorship attribution, which has recently received unprecedented coverage in the international press: first, concerning the controversy over Richard Abrams' and Donald Foster's assertion of Shakespearean authorship of an obscure elegy, and later (and even more spectacularly), concerning Foster's subsequent attempt to identify the author of "Primary Colors" (Random House, 1996). To satisfy the obsession of the White House staff and the Washington and New York press corps to find out who wrote the book, Foster created an e-text archive of the principal candidates and used statistical methods to identify CBS correspondent Joe Klein as the author. After repeated denials on numerous international television shows and in the press, Klein finally admitted writing "Primary Colors", leading to unprecedented media interest in methods that have been a mainstay of humanities computing for decades.

The debate presented in this number of Computers and the Humanities includes an attack by Elliot and Valenza on statistical methods used in Shakespearean authorship studies, and Donald Foster's detailed rebuttal of their claims. The regular articles in the issue also report on results of computer-assisted stylistic studies.

The articles in this number of CHum are sure to fuel the continued debate over statistical methods, and are of interest to all those involved in authorship and stylistic studies as well as statistical methods for language analysis generally.

Table of Contents

DEBATES IN HUMANITIES COMPUTING: Methodology in Authorship Studies

And Then There Were None: Winnowing the Shakespeare Claimants
Ward E. Y. Elliot and Robert J. Valenza

Response to Elliot and Valenza "And Then There Were None"
Donald W. Foster


Traditional and Emotional Stylometric Analysis of the Songs of Beatles Paul McCartney and John Lennon
Cynthia Whissell

Tamburlaine Stalks in Henry VI
Thomas Merriam


Computers and the Humanities
The Official Journal of The Association for Computers and the Humanities

Nancy Ide, Dept. of Computer Science, Vassar College, USA

Daniel Greenstein, Executive, Arts and Humanities Data Services, King's College, UK

For subscriptions or information, please consult http://kapis.www.wkap.nl/ or contact:

Dieke van Wijnen, Kluwer Academic Publishers, Spuiboulevard 50, P.O. Box 17, 3300 AA Dordrecht, The Netherlands. Phone: (+31) 78 639 22 64 Fax: (+31) 78 639 22 54 E-mail: Dieke.vanWijnen@wkap.nl



HTML Author: Sarah Porter
Document created: 27 May 1997

The URL of this document is http://info.ox.ac.uk/ctitext/enquiry/tat03.html