An Encoding Model for Genetic Editions

About this Document

This document describes a draft encoding model for Genetic Editions and Genetic Editing. The document is the product of a Workgroup on Genetic Editions (chair: Fotis Jannidis), which is part of the TEI MS SIG (chairs: Elena Pierazzo, Malte Rehbein, Amanda Galley).

The workgroup's goal was to develop an Application Profile for the encoding of genetic editions and, in general, genetic phenomena. It is expressed as a TEI P5-conformant customization, integrating material from the existing TEI Guidelines, chiefly Chapter 11 (Representation of Primary Sources) and Chapter 12 (Critical Apparatus), together with new material. At the end of the process described in the following section, it may constitute a self-standing new chapter of the Guidelines, or remain a set of recommendations for how to customize them; that is a decision for the TEI Council.

The document reflects discussions held at a number of different meetings:

The work group was initially inspired by HNML (HyperNietzsche Markup Language) and its later versions (GML, Genetic Markup Language), produced by Paolo D'Iorio and colleagues from the HyperNietzsche project. We would like to thank Paolo D'Iorio for his invaluable contribution in the early stages of the work.

This version of the document has been extensively revised by Elena Pierazzo and Lou Burnard, for presentation at a panel to be held at the Annual TEI Members Meeting in November 2009.

Work Plan

The planned evolution of this document and the encoding model it describes may be summarized as follows:

This draft document is publicly available for discussion and feedback from the community. The document source is maintained in the TEI subversion repository at http://tei.svn.sourceforge.net/viewvc/tei/trunk/genetic/ ; information about the development of the proposals and associated materials, including complete drafts of this document, are hosted on the TEI Wiki at http://wiki.tei-c.org/index.php/Category:Genetic_Editions .

Conventions used

Although the entire document is a draft and therefore subject to change, some sections are less stable than others. In particular, when a section or a particular element requires further discussion or is considered an open problem, it is marked with an asterisk (*).

As required for TEI conformance, non-TEI elements are defined in a distinct non-TEI namespace. In the usage examples and throughout this document that namespace is mapped to the prefix ge:, while TEI elements are not marked by any namespace prefix.

1 Theoretical Framework

The genetic approach differs from other approaches to the study of texts because it aims not only to identify ‘what is on the page’, but also to reconstruct the process necessary to produce ‘what is on the page’.

The encoding model for Genetic Editing must therefore handle:

A genetic edition may be prepared by producing a full transcription of all extant witnesses, or by combining a full transcription of only one document (a base-text) with information derived from other witnesses by means of automatic collation.

Because our model aims to be independent of presuppositions associated with any particular theoretical framework, we begin by reviewing some typical dichotomies in editorial theory.

1.1 Fact vs. Interpretation

In German editorial theory there is a well known opposition between what is there in the source document, the record (Befund), and the interpretation of this phenomenon (Deutung). This opposition implies that there is a way to talk about the record without any interpretation. Yet at some possibly simplistic level, everything we say about a text is based on interpretation, particularly in the realm of genetic criticism. 1 At the same time, there is an obvious difference between the interpretation that some trace of ink is indeed a specific letter and the assumption that a change in one line of a manuscript must have been made at the same time as a change in another line because their effects are textually related (for example, the first change was to a rhyming word, which necessitated the second change). Therefore we propose to talk about differing levels of interpretation, thus differentiating between ‘what’s there’ (document/fact) and ‘how does it relate’ (text/interpretation).

1.2 Document vs. Text

In Manuscript Studies (Editing, Codicology, Palaeography, Art History, History) the first level of enquiry is always the document, the physical support that lies in front of the scholar’s eyes.

To understand the text that is contained in the manuscript, a deep study of the manuscript itself is fundamental: the layout, the type of script, the type of writing support, the binding and many other aspects are able to tell us about when, where, and why this particular text was composed. The text therefore represents a different level of enquiry: it is a construct, derived from the reading of the documents.

In the case of modern draft manuscripts scholars must give detailed consideration to the layout, the different stratifications of writing and the disposition of these in the physical space; all of these, together with an understanding of the text, are required to gain insight about the composition, time of revisions, and flow (flux) of the text. Furthermore, in some cases, we know that the kind of physical support used to record it not only influences but may also actually determine the text itself. For instance, the content and the length of letters are often determined by the size and quantity of the paper available to the writer; even more so for items such as postcards.

The TEI has traditionally prioritised the text level. Of the two possible views available to someone transcribing a primary source (text and document), the TEI privileges the text (hence Text Encoding Initiative). Such physical or topographical information as a typical TEI encoding provides is subordinate to the main structural encoding, whether because it is represented by empty elements (<pb/>, <lb/>, <cb/>) or attributes (<add place="">, <note place="">, or rend). The TEI thus reflects the not uncommon view that, while relevant, documents are somehow less relevant than the texts they embody; to use a bibliographical metaphor, texts are ‘substantial’ while documents are ‘accidental’.

However, for genetic editions a focus on the document is crucial. In many cases, the only way to reconstruct the process of writing and re-writing which leads to a new text is to examine a specific document. We therefore propose to complement the existing text-focussed approach with a new encoding scheme focussed instead on the document.

We should then clarify the way we will use the following words:

2 Aspects of Genetic Editions

Modern genetic editions encode the genetic process within one manuscript and over the course of two or more manuscripts; in the latter case they quite often also offer a view of each manuscript as a single self-contained object. This is because the manuscript view provides the material basis for the relationships established at the inter-manuscript level. Therefore we propose to differentiate between the following aspects of a genetic edition:

3 The document level

3.1 Transcription of a document

A document-based transcription is hierarchically organised in the following way:
  • document
    • Writing Surface (page, double page, folium, etc.)
      • zone
        • Line (or p or other block elements)
We propose to introduce a new element, <ge:document> to encode a document-based transcription, at the same level as the existing TEI text element. A full TEI document may thus comprise:
  • a TEI Header, containing metadata
  • a TEI facsimile element, containing and describing visual representations of a document
  • a <ge:document> element, containing a genetic transcription of a document
  • a TEI text element, containing an encoded version of the text constructed from the document.
The header and at least one of the other three components must be present. We do not discuss facsimile or text elements here.
In the simplest case, a document contains one or more written surfaces, of various types (pages, for example). Each surface may contain one or more distinct zones of writing, each comprising one or more identifiable topographic lines. The following elements are used to represent these basic components:
  • document contains a document-centric transcription of a primary source, providing topographical information as well as transcription
  • surface defines a written surface in terms of a rectangular coordinate space, optionally grouping one or more graphic representations of that space, and rectangular zones of interest within it.
    type characterizes the element in some sense, using any convenient classification scheme or typology.
  • zone defines a rectangular area contained within a surface element.
    rotate indicates the amount by which this zone has been rotated clockwise, with respect to the normal orientation of the parent surface element as implied by the dimensions given in the msDesc section or by the coordinates of the surface itself. The orientation is expressed in arc degrees.
    stage points to a <stageNote> which contains a description of a text-stage to which the editors think the alteration marked by the element bearing this attribute (and its children) belongs.
  • line contains the transcription of a topographic line in the source document

Like a facsimile, a <ge:document> contains information about the written surfaces constituting a document. Because of this similarity, we would like to use the same elements (surface and zone) as proposed in the existing TEI scheme, although these place limits on what can be described. Specifically, the zone element as currently defined can represent only a rectangular area; it also lacks any way of stating the baseline applicable to any writing contained within it.

The size of the writing surface is defined by a set of Cartesian coordinates measured from the top left corner. The coordinates of all zones identified within the writing surface are given in terms of the same coordinate system, as further discussed in the TEI proposals for facsimile. It will often be the case that explicit dimensions for a manuscript page (expressed in mm, for example) are also supplied in a msDesc element in the TEI Header, but this is not a requirement; in particular there is no assumption that the coordinate system defined by a surface maps to any particular external dimensions, nor that the coordinate systems of different documents necessarily correspond.

A surface element may contain any number of zone, graphic, or line elements. The graphic element is used to point to any graphic (non-textual) component forming part of the page, in the usual TEI manner. The zone element is used to delimit any contiguous section of writing which the encoder wishes to identify for some purpose.

Zones can be nested and grouped, and can also overlap. Their position within the surface element is defined by coordinate values taken from the same coordinate system as the surface itself, measured from the top left corner. The zone element carries a rotate attribute which describes (in degrees) the orientation of the surface, relative to its normal orientation, when the content (writing, images) of that zone was supplied. Note that the mechanism aims to describe the process by which the content of a specific zone has been supplied (i.e. the author has physically rotated the writing surface) rather than the orientation of the writing.

Zones are arbitrarily defined by the encoder according to the layout of the writing surface and can make use of a standardised vocabulary (e.g. the top margin).

To overcome the inherent limitations of using the existing zone and surface elements, we propose to extend their capability to include the definition of arbitrary polygons and baselines, probably by embedding appropriate elements from the Scalable Vector Graphics (SVG) XML namespace. This work is not yet complete, however.

The attribute stage is used to indicate the stage in a writing campaign to which this zone has been assigned by the encoder, as further discussed in 3.3 Revision campaigns below.

Within a zone, individual lines of writing are usually distinguished using the <ge:line> element.

In the following imaginary example, there are two main areas of writing, the diary entry (black ink) and another (supposedly later) annotation in blue ink.
Note that the diary entry forms a zone which itself contains a nested zone holding the date, followed by three lines about birds. The five lines of annotation in blue ink form another zone; to write them, the author rotated the diary page 90° clockwise. Using the elements discussed so far, the page might be transcribed as follows:
<ge:document>
 <surface
   ulx="0"
   uly="0"
   lrx="200"
   lry="300">

  <zone
    ulx="10"
    uly="43"
    lrx="185"
    lry="84"
    rotate="0">

   <zone>
    <ge:line rend="right">1 April 2009
    </ge:line>
   </zone>
   <ge:line>Fed Birds in the park today.</ge:line>
   <ge:line>Might write an article about </ge:line>
   <ge:line>the Thick-billed Warbler. </ge:line>
  </zone>
  <zone
    stage="#stage1"
    ulx="9"
    uly="20"
    lrx="70"
    lry="60"
    rotate="90">

   <ge:line>Samaria is a Greek </ge:line>
   <ge:line>brand of water that</ge:line>
   <ge:line>comes from the natural</ge:line>
   <ge:line>springs of Stilos, in </ge:line>
   <ge:line>Crete </ge:line>
  </zone>
 </surface>
</ge:document>
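The coordinate conventions described above lend themselves to simple mechanical checks. The following Python sketch is only an illustration under stated assumptions: the URI bound to the ge: prefix is a placeholder (this document does not fix one), surface and zone are assumed to live in the TEI namespace, a missing rotate attribute is treated as 0, and the sample is an abridged version of the diary page. The function names box and describe_zones are hypothetical.

```python
import xml.etree.ElementTree as ET

TEI = "http://www.tei-c.org/ns/1.0"
GE = "http://example.org/ns/genetic"  # placeholder URI for ge:, an assumption
NS = {"tei": TEI, "ge": GE}

# Abridged version of the diary page transcribed above.
SAMPLE = f"""
<ge:document xmlns="{TEI}" xmlns:ge="{GE}">
 <surface ulx="0" uly="0" lrx="200" lry="300">
  <zone ulx="10" uly="43" lrx="185" lry="84" rotate="0">
   <zone><ge:line rend="right">1 April 2009</ge:line></zone>
   <ge:line>Fed Birds in the park today.</ge:line>
  </zone>
  <zone ulx="9" uly="20" lrx="70" lry="60" rotate="90">
   <ge:line>Samaria is a Greek</ge:line>
  </zone>
 </surface>
</ge:document>
"""


def box(elem):
    """Return (ulx, uly, lrx, lry) as integers, or None if any is absent."""
    keys = ("ulx", "uly", "lrx", "lry")
    if any(k not in elem.attrib for k in keys):
        return None
    return tuple(int(elem.get(k)) for k in keys)


def describe_zones(xml_text):
    """List each zone (in document order) with box, rotation and line count."""
    root = ET.fromstring(xml_text)
    report = []
    for surface in root.iter(f"{{{TEI}}}surface"):
        s_box = box(surface)
        for zone in surface.iter(f"{{{TEI}}}zone"):
            z_box = box(zone)
            inside = None  # unknown when either box is missing
            if s_box and z_box:
                inside = (s_box[0] <= z_box[0] <= z_box[2] <= s_box[2]
                          and s_box[1] <= z_box[1] <= z_box[3] <= s_box[3])
            report.append({
                "box": z_box,
                # Assumption: a zone without rotate has not been rotated.
                "rotate": int(zone.get("rotate", "0")),
                # Only lines that are direct children of this zone.
                "lines": len(zone.findall("ge:line", NS)),
                "inside_surface": inside,
            })
    return report
```

Calling describe_zones(SAMPLE) yields one entry per zone, including the nested date zone, which carries no coordinates of its own and is therefore reported with an unknown bounds check rather than a failed one.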
For comparison, here is a typical TEI transcription of the same page, focussing on its textual structure:
<div type="diary-entry">
 <dateline>
  <date when="2009-04-01"> 1 April 2009 </date>
 </dateline>
 <p>
  <lb/>Fed Birds in the park today.<lb/> Might write an article about
 <lb/> the Thick-billed Warbler. </p>
</div>
<div type="note" rend="rotated">
 <p>
  <lb/>Samaria is a Greek <lb/> brand of water that <lb/> comes from the
   natural <lb/> springs of Stilos, in <lb/> Crete</p>
</div>

Is it possible to combine both perspectives within a single encoding? In general a document-based transcription, which is done page-by-page and possibly line-by-line, is almost certain to overlap with some part of the text-based structure. The cleanest solution may be to encode both structures separately, providing both a document and a distinct text solution, perhaps using some form of external pointing to link the two, and minimizing redundancy of encoding by using XInclude. This option is further discussed below and also in the TEI Guidelines.

A further, and possibly simpler, approach is to apply to the textual elements such as p exactly the same kind of ‘flattening’ approach as has been applied to the line elements in the preceding example. Instead of marking the textual paragraphs as full elements, we mark only their frontiers, using the standard TEI milestone element, with the addition of a spanning attribute, as follows:
<surface
  ulx="0"
  uly="0"
  lrx="200"
  lry="300">

 <zone
   stage="#stage1"
   seq="0"
   ulx="10"
   uly="43"
   lrx="185"
   lry="84">

  <zone>
   <milestone unit="date" spanTo="#endDate"/>1 April 2009 <anchor xml:id="endDate"/>
  </zone>
  <milestone unit="p" spanTo="#p2"/>
  <ge:line>Fed Birds in the park today.</ge:line>
  <ge:line> Might write an article about </ge:line>
  <ge:line>the Thick-billed Warbler.</ge:line>
 </zone>
 <zone
   stage="#stage2"
   ulx="9"
   uly="20"
   lrx="70"
   lry="60"
   rotate="90">

  <milestone unit="p" xml:id="p2" spanTo="#end"/>
  <ge:line>Samaria is a Greek</ge:line>
  <ge:line>brand of water that</ge:line>
  <ge:line>comes from the natural</ge:line>
  <ge:line>springs of Stilos, in</ge:line>
  <ge:line>Crete</ge:line>
  <anchor xml:id="end"/>
 </zone>
</surface>
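To show how a processor might recover the textual units from such a flattened encoding, here is a minimal Python sketch. It is not part of the proposal: the URI bound to ge: is a placeholder, the names events and spans are hypothetical, and the sample abridges the encoding above. It walks the transcription in document order and, for each element carrying spanTo, collects the text up to the element whose xml:id matches the pointer.

```python
import xml.etree.ElementTree as ET

TEI = "http://www.tei-c.org/ns/1.0"
GE = "http://example.org/ns/genetic"  # placeholder URI for ge:, an assumption
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Abridged version of the milestone-based encoding above.
SAMPLE = f"""
<surface xmlns="{TEI}" xmlns:ge="{GE}">
 <zone>
  <zone><milestone unit="date" spanTo="#endDate"/>1 April 2009 <anchor xml:id="endDate"/></zone>
  <milestone unit="p" spanTo="#p2"/>
  <ge:line>Fed Birds in the park today.</ge:line>
  <ge:line> Might write an article about </ge:line>
  <ge:line>the Thick-billed Warbler.</ge:line>
 </zone>
 <zone rotate="90">
  <milestone unit="p" xml:id="p2" spanTo="#end"/>
  <ge:line>Samaria is a Greek</ge:line>
  <ge:line>brand of water that</ge:line>
  <anchor xml:id="end"/>
 </zone>
</surface>
"""


def events(elem):
    """Yield ("elem", element) and ("text", string) items in document order."""
    yield ("elem", elem)
    if elem.text:
        yield ("text", elem.text)
    for child in elem:
        yield from events(child)
        if child.tail:
            yield ("text", child.tail)


def spans(xml_text):
    """Return (unit, text) for every element that carries a spanTo pointer."""
    stream = list(events(ET.fromstring(xml_text)))
    found = []
    for i, (kind, item) in enumerate(stream):
        if kind != "elem" or item.get("spanTo") is None:
            continue
        target = item.get("spanTo").lstrip("#")
        parts = []
        for kind2, item2 in stream[i + 1:]:
            if kind2 == "elem" and item2.get(XML_ID) == target:
                break  # reached the end of the span
            if kind2 == "text":
                parts.append(item2)
        else:
            continue  # pointer never resolved: skip this span
        # Collapse the line breaks and indentation of the transcription.
        found.append((item.get("unit"), " ".join("".join(parts).split())))
    return found
```

Because the walk ignores zone boundaries, the first paragraph is recovered intact even though its end point lies in a different zone; the same approach would apply to any other spanning element that uses spanTo.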
The written surfaces of which a document is composed are not always homogeneous. In the following example, taken from the Walt Whitman archive, two pieces of newsprint have been glued to a piece of blue paper on which a poem is being drafted:
Figure 1. Image from http://www.whitmanarchive.org/resources/sleepers/duk.00258.001.jpg
The two pieces of newsprint might perhaps be regarded as special kinds of zone, but they are effectively new surfaces, since they might contain additional written zones themselves (such as the numbers in this case). We therefore propose a distinct element, patch, which can appear within a surface, and behaves effectively like one, except that it contains specialized attributes to provide additional information.
  • patch contains a part of a written surface which was originally physically distinct but became attached to it at the time that one or more written zones were created on it.
    binder describes the method by which a patch is or was connected to the main surface.
    type characterizes the element in some sense, using any convenient classification scheme or typology.
    height height of the patch in mm
    width width of the patch in mm
Using this element, the Whitman draft above might be encoded as follows:
<surface>
 <zone>
  <ge:line>Poem</ge:line>
  <ge:line>As in Visions of — at</ge:line>
  <ge:line>night —</ge:line>
  <ge:line>All sorts of fancies running through</ge:line>
  <ge:line>the head</ge:line>
 </zone>
 <ge:patch
   type="newsprint"
   binder="glue"
   height="40"
   width="90">
Spring has just set in here, and the weather....
   a steamer
 <zone>
   <ge:metaMark function="sequence">2</ge:metaMark>
  </zone></ge:patch>
 <ge:patch
   type="newsprint"
   binder="glue"
   height="35"
   width="90">
"The shores on either side of the Sound are...
   The In-
 <zone>
   <ge:metaMark function="sequence">3</ge:metaMark>
  </zone></ge:patch>
</surface>
The <ge:metaMark> element used in this example is further discussed below (3.2.3 Metamarks).

3.2 Textual Alterations

Traces of authorial alteration (correction, addition, deletion, etc.) are frequently found within a single document, and may also be inferred when different documents are compared. It is however an open question as to whether inter-document discrepancies at the dossier level should be regarded in the same way as intra-document alterations. If two witnesses are collated, we may observe that a word present in one is missing from the other: does it necessarily follow that this is an addition or a deletion, which we would not hesitate to mark with an add or del tag if we are transcribing a single manuscript? We return to this question below.

In this section we discuss elements introduced for the markup of alterations at the document level, within a single document, complementary to the elements already provided for this purpose by the TEI scheme. We discuss specifically:
  • ‘meta-marks’, that is a kind of authorial markup present in the source and indicating how it should be read;
  • additions and rewritings, where new material has been inserted or a passage has been rewritten to fix or clarify it;
  • deletions, where a passage has been struck through to indicate that it has been removed, or where a deletion has itself been cancelled;
  • transpositions, where passages have been reorganized or resequenced.
The TEI already provides the following basic elements for transcription, which constitute the model.pPart.transcriptional class:
  • add (addition) contains letters, words, or phrases inserted in the text by an author, scribe, annotator, or corrector.
  • app (apparatus entry) contains one entry in a critical apparatus, with an optional lemma and at least one reading.
  • corr (correction) contains the correct form of a passage apparently erroneous in the copy text.
  • damage contains an area of damage to the text witness.
  • del (deletion) contains a letter, word, or passage deleted, marked as deleted, or otherwise indicated as superfluous or spurious in the copy text by an author, scribe, annotator, or corrector.
  • orig (original form) contains a reading which is marked as following the original, rather than being normalized or corrected.
  • reg (regularization) contains a reading which has been regularized or normalized in some sense.
  • restore indicates restoration of text to an earlier state by cancellation of an editorial or authorial marking or instruction.
  • sic (Latin for thus or so) contains text reproduced although apparently incorrect or inaccurate.
  • supplied signifies text supplied by the transcriber or editor for any reason, typically because the original cannot be read because of physical damage or loss to the original.
  • unclear contains a word, phrase, or passage which cannot be transcribed with certainty because it is illegible or inaudible in the source.
The present proposals extend this list by adding the following elements to the same class:
  • metaMark (meta mark) A textual or graphical element in a manuscript that is functional but not part of the text. It may transform the text, like a strikethrough, or provide meta-information, like a date.
    function describes the function (e.g. add, delete, alternate) of the mark.
    targets indicates the element(s) to which the function of the meta-mark refers. Pointers are separated by white space.
  • rewrite contains a sequence of text which has been rewritten by the author, for example by over-inking, to clarify or fix it.
    cause documents the presumed cause of the rewriting.
  • used/ In many cases, authors mark portions of text as having been used, usually meaning the text has been transcribed to a fair copy. The mark is often a strikethrough, but can be any author-specific mark.
    spanTo indicates the end of a span initiated by the element bearing this attribute.
  • undo/ Marks up an action represented by an element to be undone.
    spanTo indicates the end of a span initiated by the element bearing this attribute.
  • subst (substitution) groups one or more deletions with one or more additions when the combination is to be regarded as a single intervention in the text.
  • transposeGrp supplies a list of transpositions indicated at some point in the text, typically by means of metamarks.
  • transpose describes a single textual transposition as an ordered list of at least two pointers specifying the order in which the elements indicated should be re-combined.
Each of these new elements is discussed in the remainder of this section.

3.2.1 Additions and rewritings

A writer may sometimes rewrite material without significant change and in the same place. We consider this a distinct activity from addition as usually defined, because no new textual material results but the status of existing material changes. We distinguish two variants of this: fixation, where the first version was a tentative draft which is subsequently fixed, for example by inking it over; and clarification, where the first version was badly written and has been rewritten for clarity. The element <ge:rewrite> is provided to cover both cases.

In this simple example, taken from the papers of Henrik Ibsen, the writer wrote the word skuldren hastily, and then returned to it to make the letter l larger and clearer:
Figure 2. Image from a ms of Peer Gynt, Collin 2869, 4°, I.1.1, the Royal Library of Copenhagen
We might transcribe this word as follows:
<ge:line>...
Sku<ge:rewrite cause="unclear">l</ge:rewrite>dren
</ge:line>
The following example, taken from a manuscript of Jane Austen's Sanditon, shows a rewriting where a pencilled passage has been fixed with ink, with some modification:
Figure 3. Image from page 70 of the Sanditon manuscript
In this example, Austen sees in the fixation an opportunity to manipulate the text previously written, and thus changes the pencilled ‘could but get’ to the inked ‘could get’. A simple way of encoding this might be as follows:
<ge:rewrite cause="fix" hand="ja2" stage="#s1">Now, if we
could get
<del stage="0">but</del> a young Heiress</ge:rewrite>
A single rewrite may not be sufficient, and it may be that the document becomes almost unreadable as a result of repeated clarification. In the following example, we can distinguish at least two attempts to write the letters er in the word bægerklang:
Figure 4. Image from http://www.emunch.no/tei-mm-2008/ms.html
We might encode this by nesting the rewrite element as follows:
<ge:line>ved
Bæg<ge:rewrite cause="unclear" stage="#stage2">
  <ge:rewrite cause="unclear" stage="#stage1">er</ge:rewrite></ge:rewrite>
...</ge:line>
The stage attribute used here is discussed further below (3.3 Revision campaigns).

3.2.2 Deletions and marked as used

In general deletion in a source is marked using the del or delSpan element. However, it is useful to distinguish cases where a passage has been ‘indicated as superfluous or spurious in the copy text by an author, scribe, annotator, or corrector’ (TEI P5, s.v. del) from cases where a passage has been struck through or otherwise marked as having been used or copied to another location. In this latter case, the author does not intend to suppress the content, but only to mark that it has been transferred or reused. The element <ge:used> is provided to mark this kind of ‘deletion’.

The following page from the Walt Whitman archive has been crossed through to indicate used material:
Figure 5. Page from http://www.whitmanarchive.org/resources/sleepers/20051105_0650.jpg
This page contains many internal deletions, but these should be distinguished from the ‘deletion’ signalled by the large cross, which actually shows that the page has been transferred or re-used, not deleted.
Material marked as re-used in this way often spans more than one zone or line. For that reason, the <ge:used> element is a spanning element, indicating the end of the used area by means of a spanTo attribute. We might encode the above page as follows:
<surface>
 <ge:used rend="cross" spanTo="#X2"/>
 <zone>
  <ge:line rend="underline">The Poet</ge:line>
  <ge:line>
   <del rend="strikethrough">I think</del> His sight is
     the</ge:line>
  <ge:line> sight of the ? and</ge:line>
  <ge:line>has sent the instinct of the</ge:line>
  <ge:line>? dog</ge:line>
 </zone>
 <zone>
  <ge:line>I think <ge:rewrite>ten</ge:rewrite> million</ge:line>
<!-- ... -->
  <ge:line>well; those <subst>
    <del rend="strikethrough">supple-fingered
         gods</del>
    <add>journeymen divine</add>
   </subst>.</ge:line>
  <anchor xml:id="X2"/>
 </zone>
</surface>

3.2.3 Metamarks

By metamark we mean marks such as numbers, arrows, crosses, or other symbols introduced by the writer into a document expressly for the purpose of indicating how the text is to be read. Such marks thus constitute a kind of markup of the document, rather than forming part of the text.

Unlike marginal notes or other additions to the text, meta-marks indicate a deliberate alteration of the writing (e.g. ‘move this passage over there’). We also consider as metamarks dates introduced to mark the beginning of a manuscript or a revision, but not forming part of it.

The <ge:metaMark> element carries a function attribute which specifies the function of the meta-mark and a targets attribute which points to the element or elements concerned.

The following example is taken from Kundige bok 2, a 15th-century legal book from the city of Göttingen, containing regulations of everyday life issued by the city council.
In the second paragraph, the sentence beginning ‘Ock en schullen de bruwere...’ was first written along with the word lege ("read") in the left hand margin, functioning as a metamark to indicate that this sentence forms part of the regulations. A further sentence was then added, while at some later stage or stages the text and also the metamark were deleted. We might encode this as follows:
<del>
 <ge:metaMark function="flag" targets="#s1">lege</ge:metaMark>
 <s xml:id="s1">Ock en schullen de bruwere des hilgen dages nicht
   over
 <lb/>setten noch uppe den stillen fridach bruwen.</s>
 <add>
  <s>Noch nymande
  <lb/>over setten, se en sehin denne erst, dat uppe den bonen
  <lb/>neyn stro noch, huw noch flaß ligghe, by pine eyner halven
  <lb/>roden, deme bruwere so wol alse dem bruwheren to murende.</s>
 </add>
</del>
Here are some further examples showing the use of this element, taken from the manuscript drafts of Thomas Moore's Lalla Rookh (1817). The first shows a simple metamark used to take stock of the progress of composition:

At regular points throughout the various drafts of the work, a number occurs, usually in the right margin (in this instance, "100"). These numbers result from the author counting the number of verse lines he has composed to the given point, and are not part of the text, but represent a stage at which Moore is taking stock of the progress of his composition.

<surface>
 <zone>
<!-- main zone -->
  <ge:line>Be this she cried &amp; wing’d her flight</ge:line>
  <ge:line>My offering at the Gates of Bliss</ge:line>
  <ge:line>
   <del>Fully to know the odours <gap extent="1" unit="word" reason="illegible"/>
   </del></ge:line>
  <ge:line>
   <del>Tho foul to heaven the vapour went</del></ge:line>
  <ge:line>
   <del>From vulgar</del>
   <add>common</add>
   <del>victors</del>, blood like this</ge:line>
  <ge:line> Shed out for freedom, flows so bright. <ge:metaMark function="count">100</ge:metaMark></ge:line>
  <ge:line> It would not stain the purest <subst>
    <del>fount</del>
    <add>rill</add>
   </subst> .</ge:line>
  <ge:line>“That sparkles thro the fields of light. <del>
    <ge:metaMark function="count">100</ge:metaMark>
   </del></ge:line>
  <ge:line> Behold her in the skies again —</ge:line>
  <ge:line>
   <subst>
    <del>But</del>
    <add>And</add>
   </subst>, tho so fleet her pinions bore</ge:line>
  <ge:line> The spirit of the Warriors slain,</ge:line>
  <ge:line>Now reach’d &amp; pass’d the gates before her</ge:line>
 </zone>
 <zone>
<!-- left zone -->
  <ge:line> Tho foul too oft the</ge:line>
  <ge:line>tears that still</ge:line>
  <ge:line>Tho foul the droppings <add>weepings</add></ge:line>
  <ge:line>that distil</ge:line>
  <ge:line>Tho foul the tears that</ge:line>
  <ge:line>oft distil</ge:line>
  <ge:line>From glory’s faulchion’</ge:line>
 </zone>
</surface>
This example demonstrates the use of a common proof-correction mark: in the left margin of the page, adjacent to a group of three cancelled lines of verse, the word "stet" is written. "Stet", a Latin word meaning "let it stand", is commonly used by authors, editors and proofreaders to indicate that a previous action should be disregarded. In this instance, Moore is indicating that the three deleted verse lines should stand, as evidenced by their appearance in the first printed edition of Lalla Rookh. The word "stet" does not form part of the text, but rather declares that a certain function be performed upon the text.
<surface>
 <zone>
  <ge:line>
   <gap extent="1" unit="word" reason="illegible"/>
   <del>in his light</del>
   <add>within</add> eyelids, within the spray</ge:line>
  <ge:line>From Eden’s fountain, when it lies</ge:line>
  <ge:line>
   <hi rend="underline">On that blue</hi>
   <del>before that</del> flower, which, Brahmins say</ge:line>
  <ge:line> Can only <add>blooms nowhere but</add> bloom in Paradise,</ge:line>
  <ge:line>
   <del xml:id="del1">“Nymph of a bright, fair but erring line!</del></ge:line>
  <ge:line>
   <ge:metaMark function="undo" targets="#del1 #del2 #del3">stet</ge:metaMark>
   <del xml:id="del2">(He gently <add>gently</add> he said) one hope is thine</del></ge:line>
  <ge:line>
   <del xml:id="del3">One hope (he gently said) is thine</del></ge:line>
 </zone>
</surface>
In this example two alternative readings are provided, without either one being prioritised or subordinated. While the author apparently first composed the line "Alone before his native river -", at some later point he entertained the possibility of using the word "beside" instead of "before". In the context of this manuscript, there is no indication of which word Moore favours, so the status of these words as possible alternative readings needs to be encoded. The evidence of the first edition of Lalla Rookh shows that the word "beside" was chosen, but for the purposes of encoding this manuscript, the facility to encode two equally possible alternative readings needs to be available.
<zone>
 <ge:line>Alone <seg type="alternative" xml:id="alt1">before</seg>
  <add place="above" type="alternative" xml:id="alt2">beside</add> his native river —</ge:line>
 <alt targets="#alt1 #alt2" mode="excl" weights="0.5 0.5"/>
</zone>
Other examples of <ge:metaMark> can be seen in marked-up proofs such as the following, taken from the Walt Whitman archive:
Figure 6. http://www.whitmanarchive.org/resources/sleepers/loc.00295.jpg

3.2.4 Transpositions

Metamarks are commonly used in the context of transposition, that is, the moving of words or blocks by the author to a different position using arrows, asterisks, numbers or other metamarks. One possible approach (used, for instance, in HNML) would be to regard such transpositions as a special kind of substitution, and to represent the result of the transposition directly in the encoding, for example by treating the segment in its original position as deleted, and as substituted by the segment in its new position.

Our recommendation is to record the actual state of the witness, but in such a way as to facilitate its reorganization as a distinct processing step. We propose to represent the re-alignment of transposed blocks or segments by means of a stand-off mechanism. The elements <ge:transposeGrp> and <ge:transpose> are provided for this purpose. For example, in the following extract from an Ibsen manuscript, the underlined numbers 1 and 2 indicate that, although the word bör precedes the word hör in the text, the order of the two words should be reversed.
Figure 7. Extracted from http://www.emunch.no/tei-mm-2008/ms.html
We may encode this as follows:
<ge:line>
 <seg xml:id="ib01">bör</seg>
 <ge:metaMark
   rend="underline"
   function="transposition"
   targets="#ib01"
   place="above">
2.</ge:metaMark> og <seg xml:id="ib02">hör</seg>
 <ge:metaMark
   rend="underline"
   function="transposition"
   targets="#ib02"
   place="above">
1.</ge:metaMark></ge:line>
<ge:transposeGrp>
 <ge:transpose>
  <ptr target="#ib02"/>
  <ptr target="#ib01"/></ge:transpose></ge:transposeGrp>
Note the use of the generic seg element to identify the sections of text being transposed. When (as in the following example) the whole of a line is to be transposed, there is no need to delimit the sections concerned:
Figure 8. Extracted from http://www.emunch.no/tei-mm-2008/ms3.html
<ge:line xml:id="ib3">
 <ge:metaMark function="transposition" place="margin-left">2.)</ge:metaMark>
thi da er du med Himmelen i Pagt; — </ge:line>
<ge:line xml:id="ib4">
 <ge:metaMark function="transposition" place="margin-left">1.)</ge:metaMark>
da kan du Folkets Jøkelhjerter tine;</ge:line>
<ge:transposeGrp>
 <ge:transpose>
  <ptr target="#ib4"/>
  <ptr target="#ib3"/></ge:transpose></ge:transposeGrp>
When a transposition is made, the whole element indicated is understood to be moved, not just its contents. In the above example, the metamarks are thus understood to be moved along with the lines to which they apply.

One or more transposeGrp elements may be supplied either embedded within the text or in the profileDesc of the header, depending on local preference. Each transposeGrp can contain one or more transpose elements, each of which defines a single transposition.
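
As a minimal sketch of the header-based option, the two Ibsen transpositions shown above could be gathered into a single transposeGrp. The placement directly inside profileDesc is an assumption made for illustration; the identifiers are those used in the examples above.

```xml
<profileDesc>
 <ge:transposeGrp>
  <!-- first transposition: hör (ib02) is to be read before bör (ib01) -->
  <ge:transpose>
   <ptr target="#ib02"/>
   <ptr target="#ib01"/>
  </ge:transpose>
  <!-- second transposition: line ib4 is to be read before line ib3 -->
  <ge:transpose>
   <ptr target="#ib4"/>
   <ptr target="#ib3"/>
  </ge:transpose>
 </ge:transposeGrp>
</profileDesc>
```

Keeping all transpositions in one place in this way leaves the transcription itself as a record of the witness, with reordering applied only by a processor that chooses to follow the pointers.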

3.2.5 Substitution

In the current model for the TEI subst element, one or more additions and deletions may be combined if they are considered as representing a single editorial act, a substitution. Without extension, this model could not therefore include cases such as the following example taken from Thomas Moore's Lalla Rookh.
Here the word pondering is deleted and the phrase she mus'd is added, while the word thus remains unchanged. It seems appropriate to treat all of this as a single substitution. This would require a modification to the content model of subst so as to permit text along with other members of model.pPart.transcriptional, so that this example could be encoded as follows:
<ge:line>While <subst>
  <del>pondering</del> thus <add>she
     mus'd</add>
 </subst>, her pinions fann'd</ge:line>

3.2.6 Undoing alterations

In some cases an author indicates that an alteration is itself to be altered: for example, a struck through passage may be restored via a dotted underlining, or the underlining of a passage may be deleted by a wavy line.

The TEI provides an element restore for one specific kind of alteration to an alteration, namely the undoing of a deletion. We propose a more general element, undo. The element <ge:undo> usually encloses the element (e.g. the add, del etc.) to be undone. If it appears within such an element, the implication is that only this part of the parent element has been cancelled. For example, in this passage taken from Giacomo Leopardi's Zibaldone (p. 3595), the phrase si rechi á was underlined word by word, and then the underlining of the word si was cancelled.
This could be encoded as follows:
<ge:line> che e’ <hi rend="underline">
  <ge:undo spanTo="#x2"/>si <anchor xml:id="x2"/> rechi a’</hi>
 <del rend="overstrike">dotti</del>
 <hi rend="underline">denti</hi> l’un d’essi cibi</ge:line>

3.3 Revision campaigns

A major purpose of genetic editing is the identification of ‘revision campaigns’ or, more generally, stages. A genetic editor needs to be able to assign a set of alterations (deletions, additions, substitutions, transpositions, etc.) to a particular revision stage, to indicate both that one or more of such phenomena preceded or followed another and also to indicate that they are related in some way, for example that one is a consequence of the other. To document this we need:
  • a system to attribute an alteration or other phenomenon occurring at document level to a particular revision campaign or text-stage
  • a way to characterize or describe a revision campaign

This has obvious similarities to the existing revisionDesc element, but concerns the source document or set of documents rather than the TEI document representing them. We therefore propose using the existing change element for the purpose of documenting individual text stages. The existing element creation (within the TEI Header profile description) is defined as the appropriate location for all information relating to the genesis or production of a text; we might therefore modify it slightly to permit a new stageHist element which contains a number of change elements, one for each identified stage. This would also be closely analogous to the existing recordHist element, which documents changes in the catalogue record related to an artefact, as well as the revisionDesc which documents changes in the digital artefact itself.

The order of change elements within the stageHist will normally be given from the earliest to latest, where this is known. The existing change element carries a number of attributes from the att.datable class (period, when, notBefore, notAfter, from, and to) which allow each stage to be dated as exactly or inexactly as necessary, in the same way as is currently possible for the TEI date element.

Typically, each change element will contain references to other annotations contained within the teiHeader or in the document, but its contents are purely documentary.

<profileDesc>
 <creation>
  <date notAfter="1816-07-18"/>
  <ge:stageHist>
   <change xml:id="mod1" when="1816-07-16">The first draft of
   <title>Persuasion</title> is completed by the <date>July 16
         1816</date> written after the word <q>Finis</q> at <ref target="#pers-30">page 30</ref>.</change>
   <change xml:id="mod2" notBefore="1816-07-16"> After the <date>16th of July</date>
       Austen starts revision of the two final chapters, by rewriting
       the end and adding a new block (<ref target="#transp-1">pages
         32-35</ref>) to be inserted at <ref target="#insertion-p1">page 19</ref>. This stage is documented by the deletion of
       the date (<date>July 16 1816</date>) at <ref target="#pers-30">page 30</ref>, and the addition of more text and of a new
       date (<date>July 18. 1816</date>) at <ref target="#pers-31">page
         31</ref>
   </change>
   <change notBefore="1816-07-18">Before publication, after <date>July 18th, 1816</date>
       chapters 10-11 were broken into three chapters, 10, 11, 12, as
       witnessed by the print.</change></ge:stageHist>
 </creation>
</profileDesc>

The targets of the various pointers (transp-1, insertion-p1 etc.) in the above example may be any part of the transcribed document which has been marked up and allocated an identifier, such as the pages or insertion points mentioned above. The former will presumably be marked as <ge:surface> elements or zone elements, while the latter may be marked using the generic TEI anchor element.

Alternatively, or in addition, we propose a generic stage attribute which can point in the opposite direction, and associate any sequence of mark-up in a document with a change element, thus allocating that particular writing event to a particular revision campaign.

Because a typical revision campaign will comprise very many individual modifications (possibly hundreds), an element called mod (for modification) is proposed as a means of delimiting the scope of all modifications to be assigned to a given change. This is a milestone-like (empty) element, placed at the start of the text affected, and indicating the end of that range by means of a spanTo attribute.

<ge:line>her face) – <ge:mod spanTo="#ch10-06-23" stage="#mod1"/>But <subst>
  <del rend="overstruck">I do not</del>
  <add place="above">you have not</add>
 </subst> see much</ge:line>
<ge:line>the Look of it <del rend="overstruck">in your
   Countenance."</del></ge:line>
<ge:line>
 <add place="above">as Grave as a little Judge."</add>,<anchor xml:id="ch10-06-23"/> – Anne blushed. – Aye, aye, that</ge:line>
In this example, the deletion of I do not, the addition of you have not, the deletion of in your Countenance, and the addition of as Grave as a little Judge are all considered part of a single text-stage. The text stage itself is documented by a change element with identifier mod1, located in a stageHist element elsewhere.

In a case like this, there is no particular assertion about the order in which any of the various modifications making up this revision campaign were effected. If such detailed analysis is required, the existing seq attribute may be used to supply a sequence number. For example, if there are two additions within a given stage, and it is clear that one precedes the other, this could be indicated by giving the earlier one a seq attribute with the value 1 and the later one a seq attribute with the value 2.
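
A minimal sketch of this usage follows, reusing two additions from the Austen passage above. The relative order shown is assumed purely for illustration and is not a claim about the manuscript; the identifier ch10-seq-end is likewise hypothetical.

```xml
<ge:line><ge:mod spanTo="#ch10-seq-end" stage="#mod1"/>But
 <!-- assumed to be the earlier of the two additions within stage mod1 -->
 <add place="above" seq="1">you have not</add> see much</ge:line>
<ge:line>
 <!-- assumed to be the later addition within the same stage -->
 <add place="above" seq="2">as Grave as a little Judge."</add><anchor xml:id="ch10-seq-end"/></ge:line>
```

Both additions remain assigned to the same text stage by the enclosing mod span; the seq values only order them relative to each other within that stage.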

The use of tags such as del and add necessarily implies that the modification concerned was made at some time after the original writing. An exception to this is where a false start or ‘instant’ correction has been identified: the author starts to write, and then immediately corrects what has been written. A special mechanism is provided for this case: the seq attribute may take the value 0 to indicate that the addition or deletion is considered to belong to the same writing stage as the rest of the unmodified document.

An example of a false start can be seen in the following line:
Figure 9. http://www.whitmanarchive.org/resources/sleepers/uva.00256.001.jpg
in which we can detect the following sequence of events:
  1. The letter T is written and then immediately deleted
  2. The word The is written, deleted, and replaced by the word His
  3. The added word His is then deleted
  4. The initial letter i of the words iron necklace is overwritten with a capital I
To indicate that the first of these acts must have taken place before the others, we might encode this revision campaign as follows:
<ge:line>
 <del seq="0">T</del>
 <subst seq="1">
  <del>The</del>
  <add place="above">
   <del rend="overstrike">His</del>
  </add>
 </subst>
 <subst seq="2">
  <del rend="overwritten">i</del>
  <add place="superimposed">I</add>
 </subst>ron necklace</ge:line>
In a set of revisions like the Austen example cited above, there is an implicit assumption that acts of deletion and addition are all assigned to the same writing stage, which is specified by the most recently stated mod element. If this is not the case, the stage attribute may also be used on any element to associate it with some other writing stage than that implied by the <ge:mod> which ‘surrounds’ it. For example, if it were determined that in the previous example the substitution of "I do not" by "you have not" should actually be assigned to some different stage (say mod3), the above example might be encoded:
<ge:line>her face) – <ge:mod spanTo="#ch10-06-23" stage="#mod1"/>But
<subst stage="#mod3">
  <del rend="overstruck">I do not</del>
  <add place="above">you have not</add>
 </subst> see much</ge:line>
Similarly, there is no reason (other than increased complexity) why mod spans should not be nested within each other, where for example several revision campaigns have been identified which are applicable to overlapping sequences of elements.
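
Such nesting might look like the following sketch, in which an outer span assigned to mod1 contains an inner span assigned to mod2. The sample text and the anchor identifiers are hypothetical, and the sketch assumes, as stated above, that each alteration belongs to the most recently stated mod whose span is still open.

```xml
<ge:line><ge:mod spanTo="#outer-end" stage="#mod1"/>
 <!-- belongs to mod1 -->
 <del>first</del> text
 <ge:mod spanTo="#inner-end" stage="#mod2"/>
 <!-- belongs to mod2, nested inside the mod1 span -->
 <add place="above">second</add><anchor xml:id="inner-end"/>
 <!-- after the inner span closes, alterations belong to mod1 again -->
 more <del>text</del><anchor xml:id="outer-end"/></ge:line>
```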

4 The dossier level

The term dossier is used to refer to the set of documents which a genetic editor considers as having contributed to the evolution of a particular text. These may include drafts, revisions, or documents related in other ways. Since each such document will most probably be encoded as a distinct TEI document with its own TEI Header, the natural way to encode a dossier would be to use the existing teiCorpus element. This would provide a TEI Header in which metadata regarding the organization of the dossier itself can be recorded, independently of the metadata regarding each particular document contained within it, which would be held in a discrete TEI Header attached to that particular encoded document.

The XInclude mechanism provides a convenient means of managing the many separate files which are likely to be needed to constitute a complete dossier. For example, supposing we have a dossier comprising three documents each of which has been encoded in its own file:
<teiCorpus xmlns:xi="http://www.w3.org/2001/XInclude"><teiHeader>
<!-- information about the dossier -->
</teiHeader>
<xi:include href="document1.xml"/>
<xi:include href="document2.xml"/>
<xi:include href="document3.xml"/>
</teiCorpus>
Note that each of the files referenced (document1.xml etc.) should contain a complete TEI element.

4.1 Genetic Grouping

Looking at the documents which constitute a given dossier, there are many types of relationships which can be identified, both amongst complete documents, and amongst parts of those documents, including even alterations, revisions and other compositional phenomena. A further complexity arises if for example an author chooses to correct two different versions at the same time 3. We may thus need to express that two or more documents are related in different ways; for instance, one document may be the sequel of another, one may have been drafted at the same time as another, one may contain material or treat topics related to those of another, for example a newspaper article may inspire or be quoted by a given work.

We propose a <ge:geneticGrp> element as a way of formally identifying any kind of relationship which has been identified amongst the components of a dossier. Specific instances of this relationship are identified by means of a <ge:geneticNote> element.
  • geneticGrp groups texts and documents which are somehow related in a genetic process
  • geneticNote describes a particular set of documents or document fragments which are considered to be mutually associated in some way.
The work of Kenneth Price and Brett Barney on the genesis of a Walt Whitman poem eventually titled The Sleepers 4 may be considered an example of genetic grouping. They consider fifteen different documents composed over 30 years, some of which bear only vague thematic resemblances to the poem, while others are more strongly related. In the following examples we assume each of these documents has been encoded as a distinct document which can be accessed by means of a pointer value such as #poem, #sweet_flag etc. We can identify different kinds of grouping amongst these documents, for instance:
    • thematic: one document (‘#curse1’) is a draft for a poem about night visions which the editors think is related to the poem (‘#poem’); another document (‘#sweet_flag’) contains ideas which reappear in ‘#curse1’
    • evolutionary: one document (or part of a document) is clearly a draft version of another; such relationships can form a complex network through which one can trace a path terminating in a ‘final’ or published version of a text.
    • type of document: documents of particular types (printed, notebook, etc.) 5
To indicate that the documents with xml:id values curse1 and sweet_flag, and likewise those with xml:id values poem and curse1, are thematically related, we might simply add a geneticNote like the following:
<profileDesc>
 <ge:geneticGrp>
  <ge:geneticNote type="thematic">
   <linkGrp>
    <link targets="#curse1 #sweet_flag"/>
    <link targets="#poem #curse1"/>
   </linkGrp>
   <p>documents that contain ideas and suggestions which
       also appear in the poem</p></ge:geneticNote></ge:geneticGrp>
</profileDesc>

Here the standard TEI link element has been used to point to the documents which are related in some way. The type attribute could be used to distinguish nuances of relationship.

4.2 Genetic Relations

By ‘genetic relations’ we mean the ordering of the different text-stages represented, either within a single document or, more probably, across different documents, into a hypothetical line of development, going, for instance, from a version A to a version B (which may be represented by a different document or by an editorially reconstructed text-stage), and then to a version C, etc.

While a <ge:geneticNote> simply describes what a group of documents have in common, a genetic relation tries to organise them into an idealised genetic or evolutionary line. The TEI offers a number of generic methods for representing such structures in the P5 chapter on Graphs, Networks, and Trees, from which we adopt the idea of representing genetic relations as directed acyclic graphs.

A graph is a structure composed of many nodes and arcs. Each node represents one document, document component, or revision stage (as defined in 3.3 Revision campaigns above), and each arc represents a connexion of some kind between two nodes. Arcs may be typed to distinguish different kinds of relationship. For our purposes, the graph is directed, because we wish to represent a particular path through it, and acyclic, because a given node can appear at only one point in the graph, although it may of course be used by many other different nodes. The graph defined by a genetic relation resembles a family tree, in that there is a single terminal node, representing the final state, with many preceding nodes linking to it, either directly or via other nodes. 6

The number of possible graphs that might be drawn for a given dossier is not of course limited in any way: the encoder may derive as many as they wish. They may also wish to represent other forms of syntagmatic structure, for narratological or other purposes, using essentially the same mechanism.

The following elements are used to represent a graph:
  • graph encodes a graph, which is a collection of nodes, and arcs which connect the nodes.
  • arc encodes an arc, the connection from one node to another in a graph.
    • from gives the identifier of the node which is adjacent from this arc.
    • to gives the identifier of the node which is adjacent to this arc.
  • node encodes a node, a possibly labeled point in a graph.
    • value provides the value of a node, which is a feature structure or other analytic element.
    • adjTo (adjacent to) gives the identifiers of the nodes which are adjacent to the current node.
    • adjFrom (adjacent from) gives the identifiers of the nodes which are adjacent from the current node.
The simplest way of representing a directed graph requires use of only the node and graph elements. For example, to represent a graph in which we say that text A derives directly from two drafts B and C, and that C derives from another document D, we might draw a graph like the following:
<graph>
 <node xml:id="N1" value="#A-text" adjTo="#N2 #N3"/>
 <node xml:id="N2" value="#B-draft" adjFrom="#N1"/>
 <node
   xml:id="N3"
   value="#C-draft"
   adjFrom="#N1"
   adjTo="#N4"/>

 <node xml:id="N4" value="#D-draft" adjFrom="#N3"/>
</graph>
In this simple version, each node specifies the document which it represents by means of the pointer supplied as its value attribute. Each node is also given its own unique identifier, which is then specified as value for the adjTo and adjFrom attributes as needed to represent the arcs joining nodes together.
If we wish to attach additional information to the arcs, for example to add labels specifying the nature of the link between individual pairs of nodes, we might represent the same structure rather differently as follows:
<graph>
 <node xml:id="NN1" value="#A-text"/>
 <node xml:id="NN2" value="#B-draft"/>
 <node xml:id="NN3" value="#C-draft"/>
 <node xml:id="NN4" value="#D-draft"/>
 <arc from="#NN2" to="#NN1">
  <label>Final-merge</label>
 </arc>
 <arc from="#NN3" to="#NN1">
  <label>Final-merge</label>
 </arc>
 <arc from="#NN4" to="#NN3">
  <label>First-draft</label>
 </arc>
</graph>
The note element may be used as elsewhere to attach any additional information describing either the node or the arc in more detail, as in the following extended example, which represents a part of the Whitman dossier discussed earlier:
<graph type="genetic">
 <node xml:id="A" value="#poem"/>
 <node xml:id="B" value="#sweet_flag"/>
 <node xml:id="C" value="#curse1">
  <note>
   <title>I am a curse</title>: an early manuscript
     draft of the "Lucifer" section of the poem that likely
     led to the 1855 printed version</note>
 </node>
 <node xml:id="D" value="#efflux"/>
 <node xml:id="E" value="#shroud">
  <note>A manuscript containing approximately seven lines, lightly
     revised, of the poem eventually titled "The Sleepers."</note>
 </node>
 <node xml:id="F" value="#curse2">
  <note>The second notebook, <title>No doubt the efflux of the
       soul</title>, is a longer one (24 leaves) that lays out the
     philosophical ideas that generate the poem and produces some of the
     key images in the first section of the poem.</note>
 </node>
 <node xml:id="G" value="#topple_down"/>
 <node xml:id="H" value="#black_lucifer"/>
 <node xml:id="Z" value="#Leaves81-82"/>
 <arc from="#C" to="#F" type="evolution"/>
 <arc from="#F" to="#G" type="evolution"/>
 <arc from="#G" to="#H" type="evolution"/>
 <arc from="#D" to="#Z" type="evolution">
  <label>First section</label>
 </arc>
 <arc from="#E" to="#Z" type="evolution">
  <label>Central section</label>
 </arc>
 <arc from="#H" to="#Z" type="evolution">
  <label>Lucifer</label>
 </arc>
</graph>
The graph can also be represented graphically as follows:

5 *Collation and Critical Apparatus

As noted above, not all kinds of variation within and between documents are equivalent. For example, most people would regard authorial modifications within a single draft or between subsequent drafts as having a different significance from modifications assigned to scribal variation within a long textual tradition, despite their formal similarities.

When a passage has been visibly deleted in one version of a text we will generally mark it explicitly; if however a passage present in one version (A) is omitted in another (B), it may be a matter of uncertainty as to whether it has been deleted from B, or added to A. Even if this is certain (perhaps because the order of the two versions is known), the omission from B of material in A is not entirely the same phenomenon as an explicit deletion.

The addition (or deletion) of a segment from a version is normally a deliberate act of the author and we would like to be able to record that in a positive way; whether we need another set of editorial elements or should use the same set that is used for transcription remains an open question.

Identifying additions or deletions on the basis of a comparison of different versions of a text is possible, using existing TEI elements for critical editing such as app and its child rdg elements. This method uses the argument e silentio: for example, to identify that something is missing from a witness, all available readings must be compared, and there is no way of explicitly marking an absent (or additional) reading. For instance, the 1856 edition of Leaves of Grass omits the words ‘all, all’ which are included in the 1881-82 version. We may record this as an apparatus:
<l n="22">And the enraged and treacherous dispositions
<app>
  <rdg wit="#Leaves81-82">, all, all</rdg>
  <rdg wit="#Leaves56"/>
 </app> sleep</l>
but this does not indicate whether the words were deleted consciously from the 1856 edition, or added in the 1881 version. Furthermore, if we decided to use the existing add or del elements within the rdg, for example:
<l n="22">And the enraged and treacherous dispositions
<app>
  <rdg wit="#Leaves81-82">
   <add>, all, all</add>
  </rdg>
  <rdg wit="#Leaves56"/>
 </app>...</l>
the result would be ambiguous: it might indicate that there is an explicit addition (for example, by interlinear or marginal interpolation) within the 1881 text, or it might indicate that this addition appears as a result of collating the 1881 and 1856 texts.
One solution might be to use a different element (say <ge:interAdd>) for additions implied by collation, reserving add for additions that happen at the document level. Another, if all the documents have been fully transcribed, might be to use stand-off techniques to represent the collation. This is a more promising possibility which the workgroup has not yet fully explored. If all the alterations occurring at document level are already encoded within each transcription, the dossier-level collation will only need to point to passages within the separate files and classify the types of readings resulting from the collation:
<app>
<rdg wit="#Leaves81-82">
<span from="#v22-5" to="#v22-8" type="add"/></rdg>
<rdg wit="#Leaves56">
<span from="#v22-5" type="del"/></rdg></app>

6 Manuscript and Dossier Levels

6.1 Chronology, Date and Time

The chronology (timing) of parts of a document or documents may be expressed in absolute terms (at such a time on such a date), or relatively (before or after some other event). Relative time can also be expressed by the relation to the (known or unknown) creation of another document or text. Dating can be justified by prose and/or by reference to a characteristic of a manuscript (e.g. hand, ink, etc.). The outlining of a chronology for a document or dossier can then be used as an argument to determine the existence of a text-stage.

6.1.1 Absolute Time

For absolute timing we envisage two cases that can occur separately or in combination:
  • The dating of a witness (a manuscript part of a dossier) or of a division is not directly documented within its content, but can be deduced from external facts and documents. The editorial date attribution should be included in the teiHeader, within the creation element.
  • Dates can be found in the document, added by the author as a kind of metadata (for instance the date of the beginning or ending of writing on a specific document, or the date of a revision campaign); those should be encoded without necessarily adding them in the teiHeader. They can be considered a type of metamark, as previously discussed (3.2.3 Metamarks).

The TEI provides a timeline element which can be used to define a scale, a co-ordinate system for measuring time. This enables us to align other components of a document with particular points in time, each such point being represented by a when element. The temporal inter-relation of when elements is expressed by attributes stating, for example, that this point in time is so many hours or years after another, or absolutely using a standardized notation for date and time (see further discussion in the Guidelines). The alignment is done using an attribute such as synch, to state that a given part of the text is aligned with a given point in time.
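
The following sketch suggests how the existing mechanism might be applied to the two dates mentioned in the Persuasion example above. The identifiers t0 and t1 are hypothetical, as is the decision to attach the synch attribute to a line of the transcription; this illustrates the existing TEI elements rather than a recommendation of the workgroup.

```xml
<timeline origin="#t0" unit="d">
 <!-- absolute point: the date written after the word "Finis" -->
 <when xml:id="t0" absolute="1816-07-16"/>
 <!-- relative point: two days after t0 -->
 <when xml:id="t1" interval="2" since="#t0"/>
</timeline>

<!-- elsewhere, in the transcription: this line is aligned with t1 -->
<ge:line synch="#t1"><date>July 18. 1816</date></ge:line>
```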

This mechanism was developed originally for the representation of transcribed speech, in particular to support overlap and discontinuity at a fairly fine-grained level. It is not clear to what extent it can be generalised to support the comparatively coarse grained and imprecise notions which typify analysis of textual genesis. In particular, the need to express alternative and uncertain temporal sequences remains problematic.

One option might be to define a specific element such as <ge:evolution> comprising a series of pointers to change elements, organized in such a way as to express alternative or varyingly certain views about their sequence, possibly using the existing TEI alt element and existing features for indicating degrees of uncertainty. Such pointers could also be synchronized with an external timeline using the existing TEI mechanisms if this was thought useful. The Workgroup has not, however, completed work on elaborating these proposals.

6.2 Documenting Editorial Decisions

Genetic editing is an essentially interpretative process; documentation of all editorial decisions is consequently of major importance.

Different forms of documentation have been discussed in previous sections:
  • <ge:stageNote>: documents the constitution of a text-stage
  • <ge:geneticNote>: documents the constitution of a genetic group
All these elements carry a cert attribute to declare the level of certainty, and a resp attribute to declare the editorial responsibility for an annotation; each should also carry an xml:id so that it can be pointed at from many places in the text.
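
As an illustrative sketch, the thematic grouping from the Whitman dossier might carry these attributes as follows. The identifier gn-thematic, the certainty value chosen, and the pointer #ed1 (standing for some description of the responsible editor) are all hypothetical.

```xml
<ge:geneticNote xml:id="gn-thematic" type="thematic" cert="medium" resp="#ed1">
 <linkGrp>
  <link targets="#poem #curse1"/>
 </linkGrp>
 <!-- prose justification of the editorial decision -->
 <p>documents that contain ideas and suggestions which
     also appear in the poem</p>
</ge:geneticNote>
```

Other annotations could then refer to this decision by pointing at #gn-thematic.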

Annotations can also occur in-line (i.e. close to a textual fragment they relate to) and in many other places; the existing note element should be used to record these.

Appendix A ODD

TEI Extension for Genetic Editions -- preliminary version

Schema geneticTEI: changed components

AnyThing

AnyThing Matches any element
Module derived-module-geneticTEI
Used by
Declaration
AnyThing =
   (
      element * { attribute * - (xml:id | xml:lang) { text }*, AnyThing }
    | text
   )*

att.staged

att.staged groups elements which can be assigned to a specific text stage by means of the attributes it provides.
Module tei
Members att.transcriptional [add addSpan del delSpan restore rewrite subst] line metaMark mod undo used zone
Attributes In addition to global attributes
stage points to a <stageNote> which contains a description of a text-stage to which the editors think the alteration marked by the element bearing this attribute (and its children) belongs.
Status Optional
Datatype xsd:anyURI | "0"
Note
The value 0 indicates that the element concerned is considered to belong to the same stage as its sibling elements.

<document> [http://www.tei-c.org/ns/geneticEditions]

<document> contains a document-centric transcription of a primary source, providing topographical information as well as transcription
Module derived-module-geneticTEI
In addition to global attributes att.global (@xml:id, @n, @xml:lang, @rend, @rendition, @xml:base) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs))
Used by
May contain
transcr: surface
Declaration
element document { att.global.attributes, surface+ }

<fallback> [http://www.w3.org/2001/XInclude]

<fallback> Wrapper for fallback elements if an XInclude fails
Module derived-module-geneticTEI
Used by
May contain Empty element
Declaration
element fallback { AnyThing }

<geneticGrp> [http://www.tei-c.org/ns/geneticEditions]

<geneticGrp> groups texts and documents which are related in some way within a genetic process
Module derived-module-geneticTEI
In addition to global attributes att.global (@xml:id, @n, @xml:lang, @rend, @rendition, @xml:base) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs))
Used by
May contain
derived-module-geneticTEI: geneticNote
Declaration
element geneticGrp { att.global.attributes, geneticNote+ }

<geneticNote> [http://www.tei-c.org/ns/geneticEditions]

<geneticNote> describes a particular set of documents or document fragments which are considered to be mutually associated in some way.
Module derived-module-geneticTEI
In addition to global attributes att.typed (@type, @subtype) att.editLike (@evidence, @source) (att.dimensions (@unit, @quantity, @extent, @precision, @scope) (att.ranging (@atLeast, @atMost, @min, @max)) ) (att.responsibility (@cert, @resp)) att.global (@xml:id, @n, @xml:lang, @rend, @rendition, @xml:base) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs))
Used by
May contain
core: p
linking: ab linkGrp
Declaration
element geneticNote
{
   att.typed.attributes,
   att.editLike.attributes,
   att.global.attributes,
   linkGrp+,
   model.pLike+
}

<include> [http://www.w3.org/2001/XInclude]

<include> The W3C XInclude element
Module derived-module-geneticTEI
In addition to global attributes
href pointer to the resource being included
Status Optional
Datatype xsd:anyURI
parse
Status Optional
Legal values are:
xml
[Default]
text
xpointer
Status Optional
Datatype text
encoding
Status Optional
Datatype text
accept
Status Optional
Datatype text
accept-charset
Status Optional
Datatype text
accept-language
Status Optional
Datatype text
Used by
May contain
derived-module-geneticTEI: fallback
Declaration
element include
{
   attribute href { xsd:anyURI }?,
   attribute parse { "xml" | "text" }?,
   attribute xpointer { text }?,
   attribute encoding { text }?,
   attribute accept { text }?,
   attribute accept-charset { text }?,
   attribute accept-language { text }?,
   fallback?
}

<line> [http://www.tei-c.org/ns/geneticEditions]

<line> contains the transcription of a topographic line in the source document
Module derived-module-geneticTEI
In addition to global attributes att.staged (@stage) att.global (@xml:id, @n, @xml:lang, @rend, @rendition, @xml:base) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs))
Used by
May contain
Declaration
element line
{
   att.staged.attributes,
   att.global.attributes,
   (
      text
    | model.global
    | model.pPart.transcriptional
    | model.pPart.editorial
    | model.segLike
    | model.gLike
    | model.hiLike
   )*
}

<metaMark> [http://www.tei-c.org/ns/geneticEditions]

<metaMark> (meta mark) contains a textual or graphical element in a manuscript that is functional but not part of the text. It may transform the text, as a strikethrough does, or provide meta-information, such as a date.
Module derived-module-geneticTEI
In addition to global attributes att.spanning (@spanTo) att.placement (@place) att.staged (@stage) att.global (@xml:id, @n, @xml:lang, @rend, @rendition, @xml:base) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs))
function describes the function (e.g. add, delete, alternate) of the mark.
Status Optional
Datatype token { pattern = "(\p{L}|\p{N}|\p{P}|\p{S})+" }
targets indicates the element(s) to which the function of the meta-mark refers. Pointers are separated by whitespace.
Status Optional
Datatype 1–∞ occurrences of  xsd:anyURI separated by whitespace
Used by
May contain
Declaration
element metaMark
{
   att.spanning.attributes,
   att.placement.attributes,
   att.staged.attributes,
   att.global.attributes,
   attribute function { token { pattern = "(\p{L}|\p{N}|\p{P}|\p{S})+" } }?,
   attribute targets { list { xsd:anyURI+ } }?,
   macro.specialPara
}
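Because targets holds a whitespace-separated list of pointers, software reading a metaMark has to split the value before resolving it. The Python sketch below shows one way this might be done. The sample fragment is hypothetical: the seg elements, their identifiers and the placement of metaMark inside line are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

TEI = "http://www.tei-c.org/ns/1.0"
GE = "http://www.tei-c.org/ns/geneticEditions"

# Hypothetical sample: one mark whose function applies to two segments.
SAMPLE = f"""
<ge:line xmlns="{TEI}" xmlns:ge="{GE}">
  <seg xml:id="w1">one</seg>
  <seg xml:id="w2">two</seg>
  <ge:metaMark function="delete" targets="#w1 #w2">X</ge:metaMark>
</ge:line>
"""


def meta_mark_targets(xml_text):
    """Return (function, [target ids]) for every metaMark in the fragment."""
    root = ET.fromstring(xml_text)
    marks = []
    for mark in root.iter(f"{{{GE}}}metaMark"):
        # str.split() with no argument splits on any run of whitespace.
        pointers = mark.get("targets", "").split()
        ids = [p[1:] for p in pointers if p.startswith("#")]
        marks.append((mark.get("function"), ids))
    return marks


print(meta_mark_targets(SAMPLE))
```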

<mod> [http://www.tei-c.org/ns/geneticEditions]

<mod> defines the scope of an area in the document containing several alterations which are considered as belonging to the same revision campaign.
Module derived-module-geneticTEI
In addition to global attributes att.spanning (@spanTo) att.staged (@stage) att.global (@xml:id, @n, @xml:lang, @rend, @rendition, @xml:base) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs))
Used by
May contain Empty element
Declaration
element mod
{
   att.spanning.attributes,
   att.staged.attributes,
   att.global.attributes,
   empty
}

model.zonePart

model.zonePart elements which can form part of a zone
Module derived-module-geneticTEI
Used by
Members line zone

<patch> [http://www.tei-c.org/ns/geneticEditions]

<patch> contains a part of a written surface which was originally physically distinct but became attached to it at the time that one or more written zones were created on it.
Module derived-module-geneticTEI
In addition to global attributes att.coordinated (@start, @ulx, @uly, @lrx, @lry) att.global (@xml:id, @n, @xml:lang, @rend, @rendition, @xml:base) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) att.typed (@type, @subtype)
binder describes the method by which a patch is or was connected to the main surface
Status Optional
Datatype xsd:Name
Sample values include:
glue
patch is glued in place
pin
patch is pinned or stapled in place
sewn
patch is sewn in place
flipping indicates whether the patch is attached and folded in such a way as to provide two writing surfaces
Status Optional
Datatype xsd:boolean
height height of the patch in mm
Status Optional
Datatype xsd:double | token { pattern = "(\-?[\d]+/\-?[\d]+)" } | xsd:decimal
width width of the patch in mm
Status Optional
Datatype xsd:double | token { pattern = "(\-?[\d]+/\-?[\d]+)" } | xsd:decimal
Used by
May contain
Declaration
element patch
{
   att.coordinated.attributes,
   att.global.attributes,
   att.typed.attributes,
   attribute binder { xsd:Name }?,
   attribute flipping { xsd:boolean }?,
   attribute height
   {
      xsd:double | token { pattern = "(\-?[\d]+/\-?[\d]+)" } | xsd:decimal
   }?,
   attribute width
   {
      xsd:double | token { pattern = "(\-?[\d]+/\-?[\d]+)" } | xsd:decimal
   }?,
   ( text | zone | model.global )*
}

<rewrite> [http://www.tei-c.org/ns/geneticEditions]

<rewrite> contains a sequence of text which has been rewritten by the author, for example by over-inking, to clarify or fix it.
Module derived-module-geneticTEI
In addition to global attributes att.spanning (@spanTo) att.global (@xml:id, @n, @xml:lang, @rend, @rendition, @xml:base) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) att.transcriptional (@hand, @status, @seq) (att.staged (@stage)) (att.editLike (@evidence, @source) (att.dimensions (@unit, @quantity, @extent, @precision, @scope) (att.ranging (@atLeast, @atMost, @min, @max)) ) (att.responsibility (@cert, @resp)) )
cause documents the presumed cause of the rewriting.
Status Optional
Datatype xsd:Name
Legal values are:
fix
rewriting for the purpose of fixation
unclear
rewriting to clarify a previously illegible or badly written sequence
Used by
May contain
Declaration
element rewrite
{
   att.spanning.attributes,
   att.global.attributes,
   att.transcriptional.attributes,
   attribute cause { "fix" | "unclear" }?,
   macro.paraContent
}
Note
Multiple rewritings are indicated by nesting one rewrite within another. In principle, a rewriting differs from a substitution in that second and subsequent rewrites do not materially alter the content of an element. Where minor changes are made during the rewriting, however, these may be marked up using del, add, etc. with an appropriate value for the stage attribute.
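Since multiple rewritings are expressed by nesting, the number of times a passage was rewritten can be read off the nesting depth. The following Python sketch counts it for a small hypothetical fragment; the sample content is an assumption made for this example.

```python
import xml.etree.ElementTree as ET

GE = "http://www.tei-c.org/ns/geneticEditions"
REWRITE = f"{{{GE}}}rewrite"

# Hypothetical sample: a word first clarified, then fixed by over-inking.
SAMPLE = f"""
<ge:rewrite xmlns:ge="{GE}" cause="fix">
  <ge:rewrite cause="unclear">word</ge:rewrite>
</ge:rewrite>
"""


def rewrite_depth(elem):
    """Return the largest number of rewrite elements on any path below elem."""
    deepest_child = max((rewrite_depth(child) for child in elem), default=0)
    return deepest_child + 1 if elem.tag == REWRITE else deepest_child


print(rewrite_depth(ET.fromstring(SAMPLE)))
```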

<stageHist> [http://www.tei-c.org/ns/geneticEditions]

<stageHist> contains one or more descriptions of the stages which have been identified in the genesis of a text.
Module derived-module-geneticTEI
In addition to global attributes att.global (@xml:id, @n, @xml:lang, @rend, @rendition, @xml:base) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs))
Used by
May contain
core: p
header: change
linking: ab
msdescription: summary
Declaration
element stageHist
{
   att.global.attributes,
   ( model.pLike+ | ( summary?, change+ ) )
}

<transpose> [http://www.tei-c.org/ns/geneticEditions]

<transpose> describes a single textual transposition as an ordered list of at least two pointers specifying the order in which the elements indicated should be re-combined.
Module derived-module-geneticTEI
In addition to global attributes att.global (@xml:id, @n, @xml:lang, @rend, @rendition, @xml:base) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs))
Used by
May contain
core: ptr
Declaration
element transpose { att.global.attributes, ( ptr, ptr+ ) }
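The Python sketch below illustrates how the ordered pointers of a transpose might be applied: it follows each ptr in document order and returns the text of the elements pointed at, in the re-combined sequence. The sample fragment, including the wrapping div, the seg elements and their identifiers, is hypothetical.

```python
import xml.etree.ElementTree as ET

TEI = "http://www.tei-c.org/ns/1.0"
GE = "http://www.tei-c.org/ns/geneticEditions"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical sample: the author marked "world hello" to be read "hello world".
SAMPLE = f"""
<div xmlns="{TEI}" xmlns:ge="{GE}">
  <ge:line><seg xml:id="a">world</seg> <seg xml:id="b">hello</seg></ge:line>
  <ge:transposeGrp>
    <ge:transpose>
      <ptr target="#b"/>
      <ptr target="#a"/>
    </ge:transpose>
  </ge:transposeGrp>
</div>
"""


def apply_transpositions(xml_text):
    """Return, for each transpose, the pointed-at texts in the indicated order."""
    root = ET.fromstring(xml_text)
    by_id = {e.get(XML_ID): e for e in root.iter() if e.get(XML_ID)}
    results = []
    for transpose in root.iter(f"{{{GE}}}transpose"):
        ordered = []
        for ptr in transpose.findall(f"{{{TEI}}}ptr"):
            # Only local "#id" pointers are resolved in this sketch.
            target = by_id[ptr.get("target").lstrip("#")]
            ordered.append("".join(target.itertext()))
        results.append(ordered)
    return results


print(apply_transpositions(SAMPLE))
```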

<transposeGrp> [http://www.tei-c.org/ns/geneticEditions]

<transposeGrp> supplies a list of transpositions indicated at some point in the text, typically by means of metamarks.
Module derived-module-geneticTEI
In addition to global attributes att.global (@xml:id, @n, @xml:lang, @rend, @rendition, @xml:base) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs))
Used by
May contain
derived-module-geneticTEI: transpose
Declaration
element transposeGrp { att.global.attributes, transpose+ }

<undo> [http://www.tei-c.org/ns/geneticEditions]

<undo> indicates that an action represented by another element is to be undone.
Module derived-module-geneticTEI
In addition to global attributes att.spanning (@spanTo) att.staged (@stage) att.global (@xml:id, @n, @xml:lang, @rend, @rendition, @xml:base) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs))
target The element representing the action to be undone.
Status Optional
Datatype xsd:anyURI
Used by
May contain Empty element
Declaration
element undo
{
   att.spanning.attributes,
   att.staged.attributes,
   att.global.attributes,
   attribute target { xsd:anyURI }?,
   empty
}
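As a rough illustration of how the target attribute might be used, the Python sketch below collects the identifiers of all actions that an undo cancels and then lists the deletions that remain in force. The sample fragment, its identifiers and the placement of undo inside line are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

TEI = "http://www.tei-c.org/ns/1.0"
GE = "http://www.tei-c.org/ns/geneticEditions"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical sample: the deletion d1 is cancelled, d2 still stands.
SAMPLE = f"""
<ge:line xmlns="{TEI}" xmlns:ge="{GE}">
  <del xml:id="d1">restored</del>
  <del xml:id="d2">removed</del>
  <ge:undo target="#d1"/>
</ge:line>
"""


def deletions_in_force(xml_text):
    """Return the xml:id of every del that no undo element points at."""
    root = ET.fromstring(xml_text)
    undone = {
        undo.get("target").lstrip("#")
        for undo in root.iter(f"{{{GE}}}undo")
        if undo.get("target")
    }
    return [
        d.get(XML_ID)
        for d in root.iter(f"{{{TEI}}}del")
        if d.get(XML_ID) not in undone
    ]


print(deletions_in_force(SAMPLE))
```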

<used> [http://www.tei-c.org/ns/geneticEditions]

<used> marks a portion of text which the author has flagged as having been used, usually meaning that the text has been transcribed to a fair copy. The mark is often a strikethrough, but can be any author-specific mark.
Module derived-module-geneticTEI
In addition to global attributes att.spanning (@spanTo) att.staged (@stage) att.global (@xml:id, @n, @xml:lang, @rend, @rendition, @xml:base) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs))
Used by
May contain Empty element
Declaration
element used
{
   att.spanning.attributes,
   att.staged.attributes,
   att.global.attributes,
   empty
}

Schema geneticTEI: unchanged components

TEI: (TEI document) contains a single TEI-conformant document, comprising a TEI header and a text, either in isolation or as part of a teiCorpus element.
ab: (anonymous block) contains any arbitrary component-level unit of text, acting as an anonymous container for phrase or inter level elements analogous to, but without the semantic baggage of, a paragraph.
abbr: (abbreviation) contains an abbreviation of any sort.
accMat: (accompanying material) contains details of any significant additional material which may be closely associated with the manuscript being described, such as non-contemporaneous documents or fragments bound in with the manuscript at some earlier historical period.
acquisition: contains any descriptive or other information concerning the process by which a manuscript or manuscript part entered the holding institution.
actor: Name of an actor appearing within a cast list.
add: (addition) contains letters, words, or phrases inserted in the text by an author, scribe, annotator, or corrector.
addSpan: (added span of text) marks the beginning of a longer sequence of text added by an author, scribe, annotator or corrector (see also add).
additional: groups additional information, combining bibliographic information about a manuscript, or surrogate copies of it with curatorial or administrative information.
additions: contains a description of any significant additions found within a manuscript, such as marginalia or other annotations.
addrLine: (address line) contains one line of a postal address.
address: contains a postal address, for example of a publisher, an organization, or an individual.
adminInfo: (administrative information) contains information about the present custody and availability of the manuscript, and also about the record description itself.
affiliation: (affiliation) contains an informal description of a person's present or past affiliation with some organization, for example an employer or sponsor.
alt: (alternation) identifies an alternation or a set of choices among elements or passages.
altGrp: (alternation group) groups a collection of alt elements and possibly pointers.
altIdentifier: (alternative identifier) contains an alternative or former structured identifier used for a manuscript, such as a former catalogue number.
am: (abbreviation marker) contains a sequence of letters or signs present in an abbreviation which are omitted or replaced in the expanded form of the abbreviation.
anchor: (anchor point) attaches an identifier to a point within a text, whether or not it corresponds with a textual element.
app: (apparatus entry) contains one entry in a critical apparatus, with an optional lemma and at least one reading.
appInfo: (application information) records information about an application which has edited the TEI file.
application: provides information about an application which has acted upon the document.
arc: encodes an arc, the connection from one node to another in a graph.
argument: A formal list or prose description of the topics addressed by a subdivision of a text.
att.ascribed: provides attributes for elements representing speech or action that can be ascribed to a specific individual.
att.canonical: provides attributes which can be used to associate a representation such as a name or title with canonical information about the object being named or referenced.
att.coordinated: elements which can be positioned within a two dimensional coordinate system.
att.damaged: provides attributes describing the nature of any physical damage affecting a reading.
att.datable: provides attributes for normalization of elements that contain dates, times, or datable events.
att.datable.iso: provides attributes for normalization of elements that contain datable events using the ISO 8601 standard.
att.datable.w3c: provides attributes for normalization of elements that contain datable events using the W3C datatypes.
att.declarable: provides attributes for those elements in the TEI Header which may be independently selected by means of the special purpose decls attribute.
att.declaring: provides attributes for elements which may be independently associated with a particular declarable element within the header, thus overriding the inherited default for that element.
att.dimensions: provides attributes for describing the size of physical objects.
att.divLike: provides attributes common to all elements which behave in the same way as divisions.
att.editLike: provides attributes describing the nature of an encoded scholarly intervention or interpretation of any kind.
att.global: provides attributes common to all elements in the TEI encoding scheme.
att.global.analytic: provides additional global attributes for associating specific analyses or interpretations with appropriate portions of a text.
att.global.facs: groups elements corresponding with all or part of an image, because they contain an alternative representation of it, typically but not necessarily a transcription of it.
att.global.linking: defines a set of attributes for hypertext and other linking, which are enabled for all elements when the additional tag set for linking is selected.
att.handFeatures: provides attributes describing aspects of the hand in which a manuscript is written.
att.internetMedia: provides attributes for specifying the type of a computer resource using a standard taxonomy.
att.interpLike: provides attributes for elements which represent a formal analysis or interpretation.
att.measurement: provides attributes to represent a regularized or normalized measurement.
att.msExcerpt: (manuscript excerpt) provides attributes used to describe excerpts from a manuscript placed in a description thereof.
att.naming: provides attributes common to elements which refer to named persons, places, organizations etc.
att.personal: (attributes for components of personal names) common attributes for those elements which form part of a personal name.
att.placement: provides attributes for describing where on the source page or object a textual element appears.
att.pointing: defines a set of attributes used by all elements which point to other elements by means of one or more URI references.
att.pointing.group: defines a set of attributes common to all elements which enclose groups of pointer elements.
att.ranging: provides attributes for describing numerical ranges.
att.rdgPart: attributes for elements which mark the beginning or ending of a fragmentary manuscript or other witness.
att.responsibility: provides attributes indicating who is responsible for something asserted by the markup and the degree of certainty associated with it.
att.segLike: provides attributes for elements used for arbitrary segmentation.
att.sourced: provides attributes identifying the source edition from which some encoded feature derives.
att.spanning: provides attributes for elements which delimit a span of text by pointing mechanisms rather than by enclosing it.
att.tableDecoration: provides attributes used to decorate rows or cells of a table.
att.textCritical: defines a set of attributes common to all elements representing variant readings in text critical work.
att.transcriptional: provides attributes specific to elements encoding authorial or scribal intervention in a text when transcribing manuscript or similar sources.
att.translatable: provides attributes used to indicate the status of a translatable portion of an ODD document.
att.typed: provides attributes which can be used to classify or subclassify elements in any way.
author: in a bibliographic reference, contains the name(s) of the author(s), personal or corporate, of a work; for example in the same form as that provided by a recognized bibliographic name authority.
authority: (release authority) supplies the name of a person or other agency responsible for making an electronic file available, other than a publisher or distributor.
availability: supplies information about the availability of a text, for example any restrictions on its use or distribution, its copyright status, etc.
back: (back matter) contains any appendixes, etc. following the main part of a text.
bibl: (bibliographic citation) contains a loosely-structured bibliographic citation of which the sub-components may or may not be explicitly tagged.
biblFull: (fully-structured bibliographic citation) contains a fully-structured bibliographic citation, in which all components of the TEI file description are present.
biblScope: (scope of citation) defines the scope of a bibliographic reference, for example as a list of page numbers, or a named subdivision of a larger work.
binding: contains a description of one binding, i.e. type of covering, boards, etc. applied to a manuscript.
bindingDesc: (binding description) describes the present and former bindings of a manuscript, either as a series of paragraphs or as a series of distinct binding elements, one for each binding of the manuscript.
body: (text body) contains the whole body of a single unitary text, excluding any front or back matter.
byline: contains the primary statement of responsibility given for a work on its title page or at the head or end of the work.
cRefPattern: (canonical reference pattern) specifies an expression and replacement pattern for transforming a canonical reference into a URI.
camera: describes a particular camera angle or viewpoint in a screen play.
caption: contains the text of a caption or other text displayed as part of a film script or screenplay.
castGroup: (cast list grouping) groups one or more individual castItem elements within a cast list.
castItem: (cast list item) contains a single entry within a cast list, describing either a single role or a list of non-speaking roles.
castList: (cast list) contains a single cast list or dramatis personae.
catDesc: (category description) describes some category within a taxonomy or text typology, either in the form of a brief prose description or in terms of the situational parameters used by the TEI formal textDesc.
catRef: (category reference) specifies one or more defined categories within some taxonomy or text typology.
catchwords: describes the system used to ensure correct ordering of the quires making up a codex or incunable, typically by means of annotations at the foot of the page.
category: contains an individual descriptive category, possibly nested within a superordinate category, within a user-defined taxonomy.
cb: (column break) marks the boundary between one column of a text and the next in a standard reference system.
cell: contains one cell of a table.
certainty: indicates the degree of certainty or uncertainty associated with some aspect of the text markup.
change: documents a particular stage in the genesis of a text.
char: (character) provides descriptive information about a character.
charDecl: (character declarations) provides information about nonstandard characters and glyphs.
charName: (character name) contains the name of a character, expressed following Unicode conventions.
charProp: (character property) provides a name and value for some property of the parent character or glyph.
choice: groups a number of alternative encodings for the same point in a text.
cit: (cited quotation) contains a quotation from some other document, together with a bibliographic reference to its source. In a dictionary it may contain an example text with at least one occurrence of the word form, used in the sense being described, or a translation of the headword, or an example.
classCode: (classification code) contains the classification code used for this text in some standard classification system.
classDecl: (classification declarations) contains one or more taxonomies defining any classificatory codes used elsewhere in the text.
climate: (climate) contains information about the physical climate of a place.
closer: groups together salutations, datelines, and similar phrases appearing as a final group at the end of a division, especially of a letter.
collation: contains a description of how the leaves or bifolia are physically arranged.
collection: contains the name of a collection of manuscripts, not necessarily located within a single repository.
colophon: contains the colophon of a manuscript item: that is, a statement providing information regarding the date, place, agency, or reason for production of the manuscript.
condition: contains a description of the physical condition of the manuscript.
corr: (correction) contains the correct form of a passage apparently erroneous in the copy text.
correction: (correction principles) states how and under what circumstances corrections have been made in the text.
country: (country) contains the name of a geo-political unit, such as a nation, country, colony, or commonwealth, larger than or administratively superior to a region and smaller than a bloc.
creation: contains information about the creation of a text.
custEvent: (custodial event) describes a single event during the custodial history of a manuscript.
custodialHist: (custodial history) contains a description of a manuscript's custodial history, either as running prose or as a series of dated custodial events.
damage: contains an area of damage to the text witness.
damageSpan: (damaged span of text) marks the beginning of a longer sequence of text which is damaged in some way but still legible.
date: contains a date in any format.
dateline: contains a brief description of the place, date, time, etc. of production of a letter, newspaper story, or other work, prefixed or suffixed to it as a kind of heading or trailer.
decoDesc: (decoration description) contains a description of the decoration of a manuscript, either as a sequence of paragraphs, or as a sequence of topically organised decoNote elements.
decoNote: (note on decoration) contains a note describing either a decorative component of a manuscript, or a fairly homogenous class of such components.
del: (deletion) contains a letter, word, or passage deleted, marked as deleted, or otherwise indicated as superfluous or spurious in the copy text by an author, scribe, annotator, or corrector.
delSpan: (deleted span of text) marks the beginning of a longer sequence of text deleted, marked as deleted, or otherwise signaled as superfluous or spurious by an author, scribe, annotator, or corrector.
depth: contains a measurement measured across the spine of a book or codex, or (for other text-bearing objects) perpendicular to the measurement given by the ‘width’ element.
desc: (description) contains a brief description of the object documented by its parent element, including its intended usage, purpose, or application where this is appropriate.
dim: contains any single measurement forming part of a dimensional specification of some sort.
dimensions: contains a dimensional specification.
distinct: identifies any word or phrase which is regarded as linguistically distinct, for example as archaic, technical, dialectal, non-preferred, etc., or as forming part of a sublanguage.
distributor: supplies the name of a person or other agency responsible for the distribution of a text.
div: (text division) contains a subdivision of the front, body, or back of a text.
divGen: (automatically generated text division) indicates the location at which a textual division generated automatically by a text-processing application is to appear.
docAuthor: (document author) contains the name of the author of the document, as given on the title page (often but not always contained in a byline).
docDate: (document date) contains the date of a document, as given (usually) on a title page.
docEdition: (document edition) contains an edition statement as presented on a title page of a document.
docImprint: (document imprint) contains the imprint statement (place and date of publication, publisher name), as given (usually) at the foot of a title page.
docTitle: (document title) contains the title of a document, including all its constituents, as given on a title page.
eLeaf: (leaf or terminal node of an embedding tree) provides explicitly for a leaf of an embedding tree, which may also be encoded with the eTree element.
eTree: (embedding tree) provides an alternative to tree element for representing ordered rooted tree structures.
edition: (edition) describes the particularities of one edition of a text.
editionStmt: (edition statement) groups information relating to one edition of a text.
editor: secondary statement of responsibility for a bibliographic item, for example the name of an individual, institution or organization, (or of several such) acting as editor, compiler, translator, etc.
editorialDecl: (editorial practice declaration) provides details of editorial principles and practices applied during the encoding of a text.
email: (electronic mail address) contains an e-mail address identifying a location to which e-mail messages can be delivered.
emph: (emphasized) marks words or phrases which are stressed or emphasized for linguistic or rhetorical effect.
encodingDesc: (encoding description) documents the relationship between an electronic text and the source or sources from which it was derived.
epigraph: contains a quotation, anonymous or attributed, appearing at the start of a section or chapter, or on a title page.
epilogue: contains the epilogue to a drama, typically spoken by an actor out of character, possibly in association with a particular performance or venue.
ex: (editorial expansion) contains a sequence of letters added by an editor or transcriber when expanding an abbreviation.
expan: (expansion) contains the expansion of an abbreviation.
explicit: contains the explicit of a manuscript item, that is, the closing words of the text proper, exclusive of any rubric or colophon which might follow it.
extent: describes the approximate size of a text as stored on some carrier medium, whether digital or non-digital, specified in any convenient units.
facsimile: contains a representation of some written source in the form of a set of images rather than as transcribed or encoded text.
figDesc: (description of figure) contains a brief prose description of the appearance or content of a graphic figure, for use when documenting an image without displaying it.
figure: groups elements representing or containing graphic information such as an illustration or figure.
fileDesc: (file description) contains a full bibliographic description of an electronic file.
filiation: contains information concerning the manuscript's filiation, i.e. its relationship to other surviving manuscripts of the same text, its protographs, antigraphs and apographs.
finalRubric: contains the string of words that denotes the end of a text division, often with an assertion as to its author and title, usually set off from the text itself by red ink, by a different size or type of script, or by some other such visual device.
floatingText: contains a single text of any kind, whether unitary or composite, which interrupts the text containing it at any point and after which the surrounding text resumes.
foliation: describes the numbering system or systems used to count the leaves or pages in a codex.
foreign: (foreign) identifies a word or phrase as belonging to some language other than that of the surrounding text.
forest: provides for groups of rooted trees.
forestGrp: (forest group) provides for groups of forests.
formula: contains a mathematical or other formula.
front: (front matter) contains any prefatory matter (headers, title page, prefaces, dedications, etc.) found at the start of a document, before the main body.
funder: (funding body) specifies the name of an individual, institution, or organization responsible for the funding of a project or text.
fw: (forme work) contains a running head (e.g. a header, footer), catchword, or similar material appearing on the current page.
g: (character or glyph) represents a non-standard character or glyph.
gap: (gap) indicates a point where material has been omitted in a transcription, whether for editorial reasons described in the TEI header, as part of sampling practice, or because the material is illegible, invisible, or inaudible.
geoDecl: (geographic coordinates declaration) documents the notation and the datum used for geographic coordinates expressed as content of the <geo> element elsewhere within the document.
gloss: identifies a phrase or word used to provide a gloss or definition for some other word or phrase.
glyph: (character glyph) provides descriptive information about a character glyph.
glyphName: (character glyph name) contains the name of a glyph, expressed following Unicode conventions for character names.
graph: encodes a graph, which is a collection of nodes, and arcs which connect the nodes.
graphic: indicates the location of an inline graphic, illustration, or figure.
group: contains the body of a composite text, grouping together a sequence of distinct texts (or groups of such texts) which are regarded as a unit for some purpose, for example the collected works of an author, a sequence of prose essays, etc.
handDesc: (description of hands) contains a description of all the different kinds of writing used in a manuscript.
handNote: (note on hand) describes a particular style or hand distinguished within a manuscript.
handNotes: contains one or more handNote elements documenting the different hands identified within the source texts.
handShift: marks the beginning of a sequence of text written in a new hand, or the beginning of a scribal stint.
height: contains a measurement measured along the axis at right angles to the bottom of the written surface, i.e. parallel to the spine for a codex or book.
heraldry: contains a heraldic formula or phrase, typically found as part of a blazon, coat of arms, etc.
hi: (highlighted) marks a word or phrase as graphically distinct from the surrounding text, for reasons concerning which no claim is made.
history: groups elements describing the full history of a manuscript or manuscript part.
hyphenation: summarizes the way in which hyphenation in a source text has been treated in an encoded version of it.
iNode: (intermediate (or internal) node) represents an intermediate (or internal) node of a tree.
idno: (identifying number) supplies any number or other identifier used to identify a bibliographic item in a standardized way.
imprimatur: contains a formal statement authorizing the publication of a work, sometimes required to appear on a title page or its verso.
incipit: contains the incipit of a manuscript item, that is the opening words of the text proper, exclusive of any rubric which might precede it, of sufficient length to identify the work uniquely; such incipits were, in former times, frequently used as a means of reference to a work, in place of a title.
index: (index entry) marks a location to be indexed for whatever purpose.
institution: contains the name of an organization such as a university or library, with which a manuscript is identified, generally its holding institution.
interp: (interpretation) summarizes a specific interpretative annotation which can be linked to a span of text.
interpGrp: (interpretation group) collects together a set of related interpretations which share responsibility or type.
interpretation: describes the scope of any analytic or interpretive information added to the text in addition to the transcription.
item: contains one component of a list.
join: identifies a possibly fragmented segment of text, by pointing at the possibly discontiguous elements which compose it.
joinGrp: (join group) groups a collection of join elements and possibly pointers.
keywords: contains a list of keywords or phrases identifying the topic or nature of a text.
l: (verse line) contains a single, possibly incomplete, line of verse.
label: contains the label associated with an item in a list; in glossaries, marks the term being defined.
lacunaEnd: indicates the end of a lacuna in a mostly complete textual witness.
lacunaStart: indicates the beginning of a lacuna in the text of a mostly complete textual witness.
langUsage: (language usage) describes the languages, sublanguages, registers, dialects, etc. represented within a text.
language: characterizes a single language or sublanguage used within a text.
layout: describes how text is laid out on the page, including information about any ruling, pricking, or other evidence of page-preparation techniques.
layoutDesc: (layout description) collects the set of layout descriptions applicable to a manuscript.
lb: (line break) marks the start of a new (typographic) line in some edition or version of a text.
leaf: encodes the leaves (terminal nodes) of a tree.
lem: (lemma) contains the lemma, or base text, of a textual variation.
lg: (line group) contains a group of verse lines functioning as a formal unit, e.g. a stanza, refrain, verse paragraph, etc.
linkGrp: (link group) defines a collection of associations or hypertextual links.
list: (list) contains any sequence of items organized as a list.
listBibl: (citation list) contains a list of bibliographic citations of any kind.
listEvent: (list of events) contains a list of descriptions, each of which provides information about an identifiable event.
listNym: (list of canonical names) contains a list of nyms, that is, standardized names for any thing.
listWit: (witness list) lists definitions for all the witnesses referred to by a critical apparatus, optionally grouped hierarchically.
localName: (locally-defined property name) contains a locally defined name for some property.
locus: defines a location within a manuscript or manuscript part, usually as a (possibly discontinuous) sequence of folio references.
locusGrp: groups a number of locations which together form a distinct but discontinuous item within a manuscript or manuscript part, according to a specific foliation.
m: (morpheme) represents a grammatical morpheme.
macro.limitedContent: (paragraph content) defines the content of prose elements that are not used for transcription of extant materials.
macro.paraContent: (paragraph content) defines the content of paragraphs and similar elements.
macro.phraseSeq: (phrase sequence) defines a sequence of character data and phrase-level elements.
macro.phraseSeq.limited: (limited phrase sequence) defines a sequence of character data and those phrase-level elements that are not typically used for transcribing extant documents.
macro.specialPara: ('special' paragraph content) defines the content model of elements such as notes or list items, which either contain a series of component-level elements or else have the same structure as a paragraph, containing a series of phrase-level and inter-level elements.
macro.xtext: (extended text) defines a sequence of character data and gaiji elements.
mapping: (character mapping) contains one or more characters which are related to the parent character or glyph in some respect, as specified by the type attribute.
material: contains a word or phrase describing the material of which a manuscript (or part of a manuscript) is composed.
measure: contains a word or phrase referring to some quantity of an object or commodity, usually comprising a number, a unit, and a commodity name.
measureGrp: (measure group) contains a group of dimensional specifications which relate to the same object, for example the height and width of a manuscript page.
mentioned: marks words or phrases mentioned, not used.
milestone: marks a boundary point separating any kind of section of a text, typically but not necessarily indicating a point at which some part of a standard reference system changes, where the change is not represented by a structural element.
model.addrPart: groups elements such as names or postal codes which may appear as part of a postal address.
model.addressLike: groups elements used to represent a postal or e-mail address.
model.applicationLike: groups elements used to record application-specific information about a document in its header.
model.biblLike: groups elements containing a bibliographic description.
model.biblPart: groups elements which represent components of a bibliographic description.
model.castItemPart: groups component elements of an entry in a cast list, such as dramatic role or actor's name.
model.catDescPart: groups component elements of the TEI Header Category Description.
model.choicePart: groups elements (other than choice itself) which can be used within a choice alternation.
model.common: groups common chunk- and inter-level elements.
model.dateLike: groups elements containing temporal expressions.
model.dimLike: groups elements which describe a measurement forming part of the physical dimensions of some object.
model.div1Like: groups top-level structural divisions.
model.divBottom: groups elements appearing at the end of a text division.
model.divBottomPart: groups elements which can occur only at the end of a text division.
model.divGenLike: groups elements used to represent a structural division which is generated rather than explicitly present in the source.
model.divLike: groups elements used to represent un-numbered generic structural divisions.
model.divPart: groups paragraph-level elements appearing directly within divisions.
model.divTop: groups elements appearing at the beginning of a text division.
model.divTopPart: groups elements which can occur only at the beginning of a text division.
model.divWrapper: groups elements which can appear at either top or bottom of a textual division.
model.editorialDeclPart: groups elements which may be used inside editorialDecl and appear multiple times.
model.egLike: groups elements containing examples or illustrations.
model.emphLike: groups phrase-level elements which are typographically distinct and to which a specific function can be attributed.
model.encodingDescPart: groups elements which may be used inside encodingDesc and appear multiple times.
model.entryPart: groups elements appearing at any level within a dictionary entry.
model.entryPart.top: groups high level elements within a structured dictionary entry.
model.frontPart: groups elements which appear at the level of divisions within front or back matter.
model.frontPart.drama: groups elements which appear at the level of divisions within front or back matter of performance texts only.
model.gLike: groups elements used to represent individual non-Unicode characters or glyphs.
model.global: groups elements which may appear at any point within a TEI text.
model.global.edit: groups globally available elements which perform a specifically editorial function.
model.global.meta: groups globally available elements which describe the status of other elements.
model.glossLike: groups elements which provide an alternative name, explanation, or description for any markup construct.
model.graphicLike: groups elements containing images, formulae, and similar objects.
model.headLike: groups elements used to provide a title or heading at the start of a text division.
model.hiLike: groups phrase-level elements which are typographically distinct but to which no specific function can be attributed.
model.highlighted: groups phrase-level elements which are typographically distinct.
model.imprintPart: groups the bibliographic elements which occur inside imprints.
model.inter: groups elements which can appear either within or between paragraph-like elements.
model.lLike: groups elements representing metrical components such as verse lines.
model.labelLike: groups elements used to gloss or explain other parts of a document.
model.limitedPhrase: groups phrase-level elements excluding those elements primarily intended for transcription of existing sources.
model.listLike: groups list-like elements.
model.measureLike: groups elements which denote a number, a quantity, a measurement, or similar piece of text that conveys some numerical meaning.
model.milestoneLike: groups milestone-style elements used to represent reference systems.
model.msItemPart: groups elements which can appear within a manuscript item description.
model.msQuoteLike: groups elements which represent passages such as titles quoted from a manuscript as a part of its description.
model.nameLike: groups elements which name or refer to a person, place, or organization.
model.nameLike.agent: groups elements which contain names of individuals or corporate bodies.
model.noteLike: groups globally-available note-like elements.
model.pLike: groups paragraph-like elements.
model.pLike.front: groups paragraph-like elements which can occur as direct constituents of front matter.
model.pPart.data: groups phrase-level elements containing names, dates, numbers, measures, and similar data.
model.pPart.edit: groups phrase-level elements for simple editorial correction and transcription.
model.pPart.editorial: groups phrase-level elements for simple editorial interventions that may be useful both in transcribing and in authoring.
model.pPart.msdesc: groups phrase-level elements used in manuscript description.
model.pPart.transcriptional: groups phrase-level elements used for editorial transcription of pre-existing source materials.
model.persStateLike: groups elements describing changeable characteristics of a person which have a definite duration, for example occupation, residence, or name.
model.personPart: groups elements which form part of the description of a person.
model.phrase: groups elements which can occur at the level of individual words or phrases.
model.physDescPart: groups specialised elements forming part of the physical description of a manuscript or similar written source.
model.placeNamePart: groups elements which form part of a place name.
model.placeStateLike: groups elements which describe changing states of a place.
model.placeTraitLike: groups elements which describe unchanging traits of a place.
model.profileDescPart: groups elements which may be used inside profileDesc and appear multiple times.
model.ptrLike: groups elements used for purposes of location and reference.
model.publicationStmtPart: groups elements which may appear within the publicationStmt element of the TEI Header.
model.qLike: groups elements related to highlighting which can appear either within or between chunk-level elements.
model.quoteLike: groups elements used to directly contain quotations.
model.rdgLike: groups elements which contain a single reading, other than the lemma, within a textual variation.
model.rdgPart: groups elements which mark the beginning or ending of a fragmentary manuscript or other witness.
model.resourceLike: groups non-textual elements which may appear together with a header and a text to constitute a TEI document.
model.respLike: groups elements which are used to indicate intellectual or other significant responsibility, for example within a bibliographic element.
model.segLike: groups elements used for arbitrary segmentation.
model.sourceDescPart: groups elements which may be used inside sourceDesc and appear multiple times.
model.stageLike: groups elements containing stage directions or similar things defined by the module for performance texts.
model.teiHeaderPart: groups high level elements which may appear more than once in a TEI Header.
model.titlepagePart: groups elements which can occur as direct constituents of a title page, such as docTitle, docAuthor, docImprint, or epigraph.
move: (movement) marks the actual entrance or exit of one or more characters on stage.
msContents: (manuscript contents) describes the intellectual content of a manuscript or manuscript part, either as a series of paragraphs or as a series of structured manuscript items.
msDesc: (manuscript description) contains a description of a single identifiable manuscript or other text-bearing object.
msIdentifier: (manuscript identifier) contains the information required to identify the manuscript being described.
msItem: (manuscript item) describes an individual work or item within the intellectual content of a manuscript or manuscript part.
msItemStruct: (structured manuscript item) contains a structured description for an individual work or item within the intellectual content of a manuscript or manuscript part.
msName: (alternative name) contains any form of unstructured alternative name used for a manuscript, such as an ‘ocellus nominum’, or nickname.
msPart: (manuscript part) contains information about an originally distinct manuscript or part of a manuscript, now forming part of a composite manuscript.
musicNotation: contains description of type of musical notation.
name: (name, proper noun) contains a proper noun or noun phrase.
node: encodes a node, a possibly labeled point in a graph.
normalization: indicates the extent of normalization or regularization of the original source carried out in converting it to electronic form.
note: contains a note or annotation.
notesStmt: (notes statement) collects together any notes providing information about a text additional to that recorded in other parts of the bibliographic description.
num: (number) contains a number, written in any form.
objectDesc: contains a description of the physical components making up the object which is being described.
opener: groups together dateline, byline, salutation, and similar phrases appearing as a preliminary group at the start of a division, especially of a letter.
orig: (original form) contains a reading which is marked as following the original, rather than being normalized or corrected.
origDate: (origin date) contains any form of date, used to identify the date of origin for a manuscript or manuscript part.
origPlace: (origin place) contains any form of place name, used to identify the place of origin for a manuscript or manuscript part.
origin: contains any descriptive or other information concerning the origin of a manuscript or manuscript part.
p: (paragraph) marks paragraphs in prose.
pb: (page break) marks the boundary between one page of a text and the next in a standard reference system.
pc: (punctuation character) a character or string of characters regarded as constituting a single punctuation mark.
performance: contains a section of front or back matter describing how a dramatic piece is to be performed in general or how it was performed on some specific occasion.
persName: (personal name) contains a proper noun or proper-noun phrase referring to a person, possibly including any or all of the person's forenames, surnames, honorifics, added names, etc.
phr: (phrase) represents a grammatical phrase.
physDesc: (physical description) contains a full physical description of a manuscript or manuscript part, optionally subdivided using more specialised elements from the model.physDescPart class.
placeName: contains an absolute or relative place name.
postscript: contains a postscript, e.g. to a letter.
precision: indicates the numerical accuracy or precision associated with some aspect of the text markup.
principal: (principal researcher) supplies the name of the principal researcher responsible for the creation of an electronic text.
profileDesc: (text-profile description) provides a detailed description of non-bibliographic aspects of a text, specifically the languages and sublanguages used, the situation in which it was produced, the participants and their setting.
projectDesc: (project description) describes in detail the aim or purpose for which an electronic file was encoded, together with any other relevant information concerning the process by which it was assembled or collected.
prologue: contains the prologue to a drama, typically spoken by an actor out of character, possibly in association with a particular performance or venue.
provenance: contains any descriptive or other information concerning a single identifiable episode during the history of a manuscript or manuscript part, after its creation but before its acquisition.
ptr: (pointer) defines a pointer to another location.
pubPlace: (publication place) contains the name of the place where a bibliographic item was published.
publicationStmt: (publication statement) groups information concerning the publication or distribution of an electronic or other text.
publisher: provides the name of the organization responsible for the publication or distribution of a bibliographic item.
q: (separated from the surrounding text with quotation marks) contains material which is marked as (ostensibly) being somehow different than the surrounding text, for any one of a variety of reasons including, but not limited to: direct speech or thought, technical terms or jargon, authorial distance, quotations from elsewhere, and passages that are mentioned but not used.
quotation: specifies editorial practice adopted with respect to quotation marks in the original.
quote: (quotation) contains a phrase or passage attributed by the narrator or author to some agency external to the text.
rdg: (reading) contains a single reading within a textual variation.
rdgGrp: (reading group) within a textual variation, groups two or more readings perceived to have a genetic relationship or other affinity.
recordHist: (recorded history) provides information about the source and revision status of the parent manuscript description itself.
ref: (reference) defines a reference to another location, possibly modified by additional text or comment.
refState: (reference state) specifies one component of a canonical reference defined by the milestone method.
refsDecl: (references declaration) specifies how canonical references are constructed for this text.
reg: (regularization) contains a reading which has been regularized or normalized in some sense.
relatedItem: contains or references some other bibliographic item which is related to the present one in some specified manner, for example as a constituent or alternative version of it.
relation: (relationship) describes any kind of relationship or linkage amongst a specified group of participants.
relationGrp: (relation group) provides information about relationships identified amongst people, places, and organizations, either informally as prose or as formally expressed relation links.
repository: contains the name of a repository within which manuscripts are stored, possibly forming part of an institution.
resp: (responsibility) contains a phrase describing the nature of a person's intellectual responsibility.
respStmt: (statement of responsibility) supplies a statement of responsibility for the intellectual content of a text, edition, recording, or series, where the specialized elements for authors, editors, etc. do not suffice or do not apply.
respons: (responsibility) identifies the individual(s) responsible for some aspect of the markup of particular element(s).
restore: indicates restoration of text to an earlier state by cancellation of an editorial or authorial marking or instruction.
revisionDesc: (revision description) summarizes the revision history for a file.
role: the name of a dramatic role, as given in a cast list.
roleDesc: (role description) describes a character's role in a drama.
root: (root node) represents the root node of a tree.
row: contains one row of a table.
rs: (referencing string) contains a general purpose name or referring string.
rubric: contains the text of any rubric or heading attached to a particular manuscript item, that is, a string of words through which a manuscript signals the beginning of a text division, often with an assertion as to its author and title, which is in some way set off from the text itself, usually in red ink, or by use of different size or type of script, or some other such visual device.
s: (s-unit) contains a sentence-like division of a text.
said: (speech or thought) indicates passages thought or spoken aloud, whether explicitly indicated in the source or not, whether directly or indirectly reported, whether by real people or fictional characters.
salute: (salutation) contains a salutation or greeting prefixed to a foreword, dedicatory epistle, or other division of a text, or the salutation in the closing of a letter, preface, etc.
samplingDecl: (sampling declaration) contains a prose description of the rationale and methods used in sampling texts in the creation of a corpus or collection.
seal: contains a description of one seal or similar attachment applied to a manuscript.
sealDesc: (seal description) describes the seals or other external items attached to a manuscript, either as a series of paragraphs or as a series of distinct seal elements, possibly with additional decoNotes.
secFol: (second folio) The word or words taken from a fixed point in a codex (typically the beginning of the second leaf) in order to provide a unique identifier for it.
seg: (arbitrary segment) represents any segmentation of text below the ‘chunk’ level.
segmentation: describes the principles according to which the text has been segmented, for example into sentences, tone-units, graphemic strata, etc.
series: (series information) contains information about the series in which a book or other bibliographic item has appeared.
seriesStmt: (series statement) groups information about the series, if any, to which a publication belongs.
set: (setting) contains a description of the setting, time, locale, appearance, etc., of the action of a play, typically found in the front matter of a printed performance text (not a stage direction).
settlement: contains the name of a settlement such as a city, town, or village identified as a single geo-political or administrative unit.
sic: (Latin for thus or so) contains text reproduced although apparently incorrect or inaccurate.
signatures: contains discussion of the leaf or quire signatures found within a codex.
signed: (signature) contains the closing salutation, etc., appended to a foreword, dedicatory epistle, or other division of a text.
soCalled: contains a word or phrase for which the author or narrator indicates a disclaiming of responsibility, for example by the use of scare quotes or italics.
sound: describes a sound effect or musical sequence specified within a screen play or radio script.
source: describes the original source for the information contained within a manuscript description.
sourceDesc: (source description) describes the source from which an electronic text was derived or generated, typically a bibliographic description in the case of a digitized text, or a phrase such as "born digital" for a text which has no previous existence.
sp: (speech) An individual speech in a performance text, or a passage presented as such in a prose or verse text.
space: indicates the location of a significant space in the copy text.
span: associates an interpretative annotation directly with a span of text.
spanGrp: (span group) collects together span tags.
speaker: A specialized form of heading or label, giving the name of one or more speakers in a dramatic text or fragment.
stage: (stage direction) contains any kind of stage direction within a dramatic text or fragment.
stamp: contains a word or phrase describing a stamp or similar device.
subst: (substitution) groups one or more deletions with one or more additions when the combination is to be regarded as a single intervention in the text.
summary: contains a brief summary of the intellectual content of an item, provided by the cataloguer.
supplied: signifies text supplied by the transcriber or editor for any reason, typically because the original cannot be read because of physical damage or loss to the original.
support: contains a description of the materials etc. which make up the physical support for the written part of a manuscript.
supportDesc: (support description) groups elements describing the physical support for the written part of a manuscript.
surface: defines a written surface in terms of a rectangular coordinate space, optionally grouping one or more graphic representations of that space, and rectangular zones of interest within it.
surplus: marks text present in the source which the editor believes to be superfluous or redundant.
surrogates: contains information about any non-digital representations of the manuscript being described which may exist in the holding institution or elsewhere.
taxonomy: defines a typology used to classify texts either implicitly, by means of a bibliographic citation, or explicitly by a structured taxonomy.
tech: (technical stage direction) describes a special-purpose stage direction that is not meant for the actors.
teiCorpus: contains the whole of a TEI encoded corpus, comprising a single corpus header and one or more TEI elements, each containing a single text header and a text.
teiHeader: (TEI Header) supplies the descriptive and declarative information making up an electronic title page prefixed to every TEI-conformant text.
term: contains a single-word, multi-word, or symbolic designation which is regarded as a technical term.
text: contains a single text of any kind, whether unitary or composite, for example a poem or drama, a collection of essays, a novel, a dictionary, or a corpus sample.
textClass: (text classification) groups information which describes the nature or topic of a text in terms of a standard classification scheme, thesaurus, etc.
textLang: (text language) in a manuscript description, describes the languages and writing systems identified within the manuscript being described.
time: contains a phrase defining a time of day in any format.
timeline: (timeline) provides a set of ordered points in time which can be linked to elements of a spoken text to create a temporal alignment of that text.
title: contains a title for any kind of work.
titlePage: (title page) contains the title page of a text, appearing within the front or back matter.
titlePart: contains a subsection or division of the title of a work, as indicated on a title page.
titleStmt: (title statement) groups information about the title of a work and those responsible for its intellectual content.
trailer: contains a closing title or footer appearing at the end of a division of a text.
tree: encodes a tree, which is made up of a root, internal nodes, leaves, and arcs from root to leaves.
triangle: (underspecified embedding tree, so called because of its characteristic shape when drawn) Provides for an underspecified eTree, that is, an eTree with information left out.
typeDesc: contains a description of the typefaces or other aspects of the printing of an incunable or other printed source.
typeNote: describes a particular font or other significant typographic feature distinguished within the description of a printed resource.
unclear: contains a word, phrase, or passage which cannot be transcribed with certainty because it is illegible or inaudible in the source.
unicodeName: (unicode property name) contains the name of a registered Unicode normative or informative property.
value: (value) contains a single value for some property, attribute, or other analysis.
variantEncoding: declares the method used to encode text-critical variants.
view: describes the visual context of some part of a screen play in terms of what the spectator sees, generally independent of any dialogue.
watermark: contains a word or phrase describing a watermark or similar device.
when: indicates a point in time either relative to other elements in the same timeline tag, or absolutely.
width: contains a measurement measured along the axis parallel to the bottom of the written surface, i.e. perpendicular to the spine of a book or codex.
wit: contains a list of one or more sigla of witnesses attesting a given reading, in a textual variation.
witDetail: (witness detail) gives further information about a particular witness, or witnesses, to a particular reading.
witEnd: (fragmented witness end) indicates the end, or suspension, of the text of a fragmentary witness.
witStart: (fragmented witness start) indicates the beginning, or resumption, of the text of a fragmentary witness.
witness: contains either a description of a single witness referred to within the critical apparatus, or a list of witnesses which is to be referred to by a single sigil.
zone: defines a rectangular area contained within a surface element.
Notes
1.
See also the TEI’s (implicit) position on this point: ‘we define markup, or (synonymously) encoding, as any means of making explicit an interpretation of a text’ (TEI Guidelines: v. A Gentle Introduction to XML). See also Robinson and Solopova 1993/1997: 21: ‘Any primary textual source… has its own semiotic system within it. […] The two semiotic systems are materially distinct, in that text written by hand is not the same as the text on the computer screen’.
2.
As in the case where a document describes the ordering of parts of a text contained in another document. This is the case, for instance, in Beckett's That Time, where the speeches of A, B and C are obsessively first subdivided and subsequently shuffled and reshuffled by means of sequences of letters and numbers contained in a number of documents.
3.
Manzoni, for instance, used to modify an old draft to see how a new variant fitted with the context before copying it into a new draft.
4.
5.
This doesn't get mentioned again, and doesn't seem to be very relevant to me. Cut it? (LB)
6.
An early proposal to use such a formalism for handling textual variation is Sperberg-McQueen, C.M. (1989). A directed-graph data structure for text manipulation. In: ICCH/ALLC Conference. The Dynamic Text at the University of Toronto. http://www.w3.org/People/cmsmcq/1989/rhine-delta-abstract.html . See also Desmond Schmidt and Robert Colomb (2009): A data structure for representing multi-version texts online, in International Journal of Human-Computer Studies, Volume 67, Issue 6 (June 2009), 497-514.
Notes
1. (Leaves81-82) (Leaves56)


Lou Burnard, Fotis Jannidis, Elena Pierazzo, Malte Rehbein. Date: Revised Draft