Appendix 2: A TEI conformant tagset for manuscript description

This appendix provides a more detailed description of the TEI extensions already introduced above. It is, of necessity, more technical than this earlier discussion, and assumes a more detailed knowledge of the TEI scheme and of the syntax of SGML on the part of the reader. In particular, it omits any discussion or explanation of SGML and of the particular element definitions which currently constitute the TEI Recommendations. (A good introductory collection of papers on the TEI is provided by Ide and Veronis (1996); up-to-date information is available from the TEI's web site.)

Basic principles

The goal of this scheme is to provide a framework which the cataloguer can use to model the structure of a manuscript. Unlike, for example, the Encoded Archival Description, which models the finding aid for a manuscript, but not the manuscript itself, our DTD is intended to facilitate the production of an SGML document within which either a full transcription or a set of page images of a manuscript can be embedded, or with which such things can readily be linked. The manuscript description (as opposed to the manuscript) must therefore be treated as metadata.

A further assumption underlying our proposals is that we should depart from the proposals of the TEI only where these are demonstrably inadequate to our needs. In practice, we rely heavily on the basic structure and content of the TEI Guidelines (Sperberg-McQueen and Burnard 1994) for the bulk of our encoding scheme. This has two implications: firstly, our manuscript description is located firmly within the existing metadata framework provided by the TEI's header; secondly, we do not here describe in any detail those parts of the TEI scheme which we use unchanged.

The TEI requires that metadata be encoded within a special element called the TEI Header (documented in chapter 5 of TEI P3). For documents which consist only of metadata (such as collections of manuscript descriptions), two mechanisms are possible: either a free standing DTD known as the Independent Header (see chapter 24 of P3) may be used, or each set of metadata may be embedded within the standard TEI biblFull element, which can then be used within the body of a text in the same way as the other bibliographic elements defined in the TEI scheme.

The TEI system includes a modification mechanism, which has been followed in defining the encoding scheme described here. This scheme assumes that the TEI's base tagset for prose and the additional tagsets for figures and for linking are in use. A TEI conformant document using this system will therefore begin with a set of declarations which associate the system files wmss.dtd and wmss.ent with the parameter entities TEI.extensions.dtd and TEI.extensions.ent respectively, as further discussed in chapter 4 of TEI P3.

The following kinds of modification are accomplished by these declarations:

definition of the mssStmt element, an additional, richly detailed component within the standard TEI Header;
definition of an additional global attribute called range, used to specify the location of any part of a manuscript description, in terms of a single consistent foliation;
definition of a chunk-level surrogate or summary element;
definition of some specialized phrase-level heading elements.

At present, we have found no need to remove any of the element definitions included in the TEI prose base extended by the additional tagsets for linking and images.

The Manuscript Statement

The manuscript statement is embedded as a further alternative component of the sourceDesc element, within an otherwise unmodified teiHeader element. The mssStmt element (by analogy with recordingStmt) contains a full description of a manuscript, that is, any object bearing handwriting or inscription. A description should be distinguished from a transcription or an edition: although it may mimic the structure of the object described, and may thus serve as a skeleton for the construction of a transcription or a set of images of a manuscript, a description remains a bibliographic substitute for an object, and as such belongs within the TEI Header.

Since manuscripts in major collections are often composed of several originally quite distinct physical components, the mssStmt element may contain a number of distinct descriptions for component parts, as an alternative, or in addition, to the description of the whole manuscript. A distinct mssStmt element should be used for each such component part, and embedded within that for the manuscript as a whole. For example, the mssStmt for a codex made up of three medieval manuscripts of different date which have subsequently been bound together, would contain a series of elements describing the whole codex, and three embedded mssStmt elements, each of which describes features of its components.

Nested manuscript descriptions do not need to repeat elements which have already been specified by their parent. For example, if the leaves element at the outermost level of a mssStmt specifies that a codex contains parchment leaves, it is assumed that all of its components are parchment, unless a further leaves element appears within a nested mssStmt.

The mssStmt element groups elements describing the physical, decorative, and other features of a manuscript or manuscript part. It carries an additional attribute which specifies the format of the manuscript or manuscript part being described. Legal values are defined by the formatTypes parameter entity; for the Bodleian application, the default values indicate that the description relates to: a complete codex; a roll; a single leaf; a leaf-fragment; an incomplete codex; or a leaf-fragment deliberately excised.

The components of the mssStmt element are listed below. The same component elements are used, whether the description relates to a whole manuscript or a part. If a given element appears both for a part and for a whole, information at the part level is understood to complement or over-ride information at the whole. If no information of a given kind is present at the part level, information of the same kind specified within its immediate parent mssStmt is applicable.
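
The inheritance rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the element names mssStmt and leaves are taken from the text, but the id values are invented, the sample is written as well-formed XML rather than SGML, and the function models only the simple case in which a part either supplies its own element or falls back on the nearest enclosing mssStmt (it does not attempt to merge complementary information).

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: only the element names mssStmt and leaves come from the
# text; the id values are invented for this illustration.
SAMPLE = """
<mssStmt id="codex">
  <leaves><p>parchment</p></leaves>
  <mssStmt id="partA"/>
  <mssStmt id="partB">
    <leaves><p>paper</p></leaves>
  </mssStmt>
</mssStmt>
"""

def effective(root, stmt, tag):
    """Return the nearest `tag` child of stmt or of its enclosing mssStmt elements."""
    parent_of = {child: parent for parent in root.iter() for child in parent}
    node = stmt
    while node is not None:
        if node.tag == "mssStmt":
            found = node.find(tag)  # direct children only
            if found is not None:
                return found
        node = parent_of.get(node)  # climb to the enclosing element, if any
    return None

root = ET.fromstring(SAMPLE)
parts = {s.get("id"): s for s in root.iter("mssStmt")}
for part_id in ("partA", "partB"):
    leaves = effective(root, parts[part_id], "leaves")
    print(part_id, "->", leaves.findtext("p"))
```

Here partA has no leaves element of its own and so takes the parchment of the whole codex, while partB over-rides it with paper.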

The following elements may appear within a mssStmt element, relating either to the whole of a manuscript, or to some distinct part within it:

decoration: groups elements describing decorative aspects of a manuscript or manuscript part.
dim: defines a set of measured dimensions within a manuscript or manuscript part.
leaves: describes the physical make-up of a manuscript or manuscript part, specifically, the leaves of which it is composed, the material of which these are composed, and any damage.
foliation: describes one or more numbering systems used to refer to individual parts of a manuscript or manuscript part, typically based on its foliation or pagination.
collation: contains a formula describing the single leaves, bifolia, and quires (or gatherings) of which a manuscript or manuscript part is composed.
scriptDesc: contains one or more scriptNote elements, describing the script or writing system used in a manuscript or manuscript part.
rubrication: contains a description of the rubrication applied to a manuscript or manuscript part, i.e. the use of red ink or other mechanisms to indicate headings, etc.
secFol: supplies a kind of fingerprint (secundo folio) used to identify a given manuscript or manuscript part.
provenance: contains a description of some stage in a manuscript's ownership or history.
listProvenance: contains an ordered sequence of provenance elements.
binding: contains information relating to the present or former binding of a manuscript or manuscript part.

The decoration element is used to group together information relating to decorative aspects (e.g. historiated or decorated initials, miniatures, borders etc.) of a manuscript or manuscript part. Its components and usage are discussed below. More than one such element may be provided for a given manuscript or manuscript part, in cases where distinct decorative campaigns have been identified.

Each of the possibly many dim elements within a manuscript description defines a set of measurements for one or more of its leaves, as further discussed below.

The optional leaves, foliation, and collation elements and their components, the scriptDesc, rubrication, and secFol elements and their components, the provenance and listProvenance elements, and the binding element are each further discussed below.

The mssStmt element is formally defined as follows: [declaration not preserved]


The following elements are available within the decoration element. Any or all of them may be used to provide detailed information about particular decorative aspects of the manuscript or manuscript part:

overview: contains a summary overview of the decorative scheme, typically including the total number of miniatures of different sizes, an indication of any missing or excised decoration, and other descriptive comments.
An element for miniatures: describes any independent (free-standing) illustrations.
historInit (i.e. historiated initials): describes initials which depict scenes or events, or contain persons or animals (even if they are unidentified), whose representational function can be considered narrative, or at least more than purely decorative.
decorInit (i.e. decorated initials): describes initials which are decorated in a non-narrative way.
border: describes decoration occupying one or more margins of the page, which is neither dependent on an initial for its structure or shape, nor a discrete miniature, as defined above.
minorDec: describes any minor decoration not covered by the elements above.
An element for commentary: contains commentary on the decoration, such as an attribution to a named artist, or discussion of the evidence for excised decoration.

All of these elements have the same sub-components: a sequence of one or more paragraphs, tagged using the TEI p element. Here is an example of the use of the overview element:

The decoration comprises two full page miniatures, perhaps added by the original owner, or slightly later; the original major decoration consists of twenty-three large miniatures, illustrating the divisions of the Passion narrative and the start of the major texts, and the major divisions of the Hours; seventeen smaller miniatures, illustrating the suffrages to saints; and seven historiated initials, illustrating the pericopes and major prayers.

In a manuscript with little decorative material, an overview of the above kind may be all that is needed. Often however the cataloguer may wish to describe the decoration in greater detail, using the appropriate sub-elements defined above. Whichever of these sub-elements of decoration are used, features will most naturally be described in the order of a descending hierarchy, based on such aspects as size, complexity, colour, or materials.

Such hierarchies may conveniently be represented using the standard TEI list element, embedded within a paragraph. The range attribute may be used to link each component item to the folio on which it appears. The iconTerm element may be used to identify particular scenes, events, persons, animals, and objects represented, as in the following example:

Fourteen large miniatures with arched tops, above five lines of text:
Pericopes. St. John writing on Patmos, with the Eagle holding his ink-pot and pen-case; some flaking of pigment, especially in the sky
Hours of the Virgin, Matins. Annunciation; Gabriel and the Dove to the right
Prime. Nativity; the Virgin and Joseph adoring the Child
Terce. Annunciation to the Shepherds, one with bagpipes


Historiated initials may be described in much the same way, using the historInit element as follows:

Seven historiated initials, six or seven lines high, illustrating pericopes and prayers:
Pericope of Luke. St. Luke writing, the Ox beside him
Pericope of Matthew. St. Matthew writing, the Angel holding a book open
Pericope of Mark. St. Mark writing (left-handed), the Lion beside him
Obsecro te. The Virgin of the Apocalypse, holding the Christ Child, standing on a crescent moon
O intemerata. The Virgin reading a book, with St. Joseph(?)
Stabat mater. Pietà, in front of the Cross
Missus est Gabriel. The Annunciation


The distinction between historiated and decorated initials is that the former typically depict scenes, events, persons, or animals, whose representational function can be considered to be more than purely decorative, while the latter is for use where initials are decorated in a non-narrative manner. In cases of uncertainty as to whether an image was intended to have more than a purely decorative function (for example hybrid creatures, dragons, or other similar stock motifs), the decorInit element should be preferred.

The following example describes a typical set of decorated (but not historiated) initials:

Four- or three-line initials in blue and red, enclosing foliage, on a gold ground, at the start of each text with a large miniature; two-line initials in gold, on a blue and red ground with white tracery, to psalms, capitula, lessons, etc. and the KL monograms in the Calendar; similar one-line initials to verses and other minor divisions.


An example of the use of the minorDec element follows. Note that this element should not be used to document such decorative aspects as the use of coloured ink for a paragraph mark, or a calligraphic flourish of the pen as part of the handwriting, which would be better described using the rubrication and script elements, respectively.

Line fillers similar to one-line initials except that line-fillers from fol.185r-193v are of a stylistically later type than the rest, and use only painted gold, on an alternately blue or red square ground.


Similar considerations apply to the use of the border element, as in the following example:

The large miniatures and the Lauds initial surrounded by four-sided framed borders of stylised foliage on a plain parchment ground, and variously-shaped panels of naturalistic plants on a painted gold ground; the small miniatures and five-line historiated initials surrounded by similar three-sided borders (in the outer margins); similar one-sided border panels on all pages with a two-line initial


The following example demonstrates the use of a list of distinct attributive commentaries relating to different parts of a manuscript, on which at least four main artists have worked.

At least four main artists worked on the book, the division of their work corresponding to the sections written by the three main scribes (see under Script):
Most of the main body of the book (up to fol. 182v) was painted and decorated in one style, having links in style and iconography with the school of Maître Francois, although several of the miniatures in this section have been damaged and overpainted at a later date (e.g. the figure of Christ on fol. 33r; the face of the Shepherdess on fol. 59v, etc.).
Within this first section, the miniature at the start of the Hours of the Virgin (fol. 34v), is in another style, more suggestive of the early 16th. cent. than the late 15th.
The border on fol. 184r is of the same type as those which precede, but the facing miniature appears to be by a later, less able, artist, who was perhaps also responsible for the coat of arms on fol. 184v.
Finally, the miniature on fol. 185r is by a hand working in the style of Jean Bourdichon: his more sophisticated use of gold highlights, his more subtle modelling, and his treatment of the landscape and the framing architecture set him apart from the miniatures which precede.


The decoration element and its components are defined as follows: [declarations not preserved]

Dimensions and areas

Manuscripts and manuscript parts may be measured in a number of different ways, using different units. Each set of measurements given for a particular part is recorded in a distinct dim element.

dim: defines a measured set of dimensions within a manuscript or manuscript part. Its attributes include one which indicates the kind of area being defined. Legal values are specified by the dimType parameter entity; for the Bodleian application, the proposed values indicate that the dimensions relate to:

a single leaf;
a distinct group of leaves (e.g. a gathering, or a separately bound part);
the area of a leaf which has been ruled in preparation for writing;
the area of a leaf which has been pricked out in preparation for ruling (used where this differs significantly from the ruled area, or where the ruling is not measurable);
the area of a leaf which has been written, with the height measured from the top of the minims on the top line of writing, to the bottom of the minims on the bottom line of writing;
the box or other container in which the whole codex or manuscript is stored.

At its simplest, a dim element may contain only a brief description, as in the following example:

Ruled in red ink for 18 lines of text per page

More usually, however, the description will include measurements of height, width, or depth, or all three, each recorded in its own element: height is the dimension measured along the axis parallel to the spine; width is the dimension measured along the axis at right angles to the spine; and depth is the dimension measured along the axis perpendicular to the spine.

For each of the above elements, attributes may be supplied to specify the units used for this measurement (the default, and recommended, value is mm, i.e. millimetres) and the applicability of this measurement to the containing manuscript or manuscript part. For the latter, the legal values indicate that: the measurement applies to all instances in this manuscript or manuscript part; the measurement applies to most instances in this manuscript or manuscript part; or all the instances measured in this manuscript or manuscript part are within the range specified.

For example, a description might indicate that the leaves of a manuscript range from 157 to 160 mm in height and are all 105 mm in width, while the ruled area on most pages is approximately 90 x 48 mm.
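
As a rough illustration of how such measurements might be read by software, the following Python sketch parses a pair of dim elements holding the figures quoted above. Only the element name dim is taken from the text; the element names height and width, the attribute names type and scope, and all attribute values are assumptions made for this sketch, and the sample is written as well-formed XML rather than SGML.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the element name dim is from the text; the names
# height, width, type and scope, and their values, are assumptions.
SAMPLE = """
<mssStmt>
  <dim type="leaf">
    <height scope="range">157-160</height>
    <width scope="all">105</width>
  </dim>
  <dim type="ruled">
    <height scope="most">90</height>
    <width scope="most">48</width>
  </dim>
</mssStmt>
"""

def parse_measure(text):
    """Turn '105' or '157-160' into a (minimum, maximum) pair, assumed to be in mm."""
    low, _, high = text.strip().partition("-")
    return (int(low), int(high or low))

def dimensions(stmt):
    """Map each dim type to its parsed measurements, keyed by child element name."""
    result = {}
    for dim in stmt.iter("dim"):
        result[dim.get("type")] = {m.tag: parse_measure(m.text) for m in dim}
    return result

dims = dimensions(ET.fromstring(SAMPLE))
print(dims["leaf"]["height"])   # (157, 160)
print(dims["ruled"]["width"])   # (48, 48)
```

Representing every measurement as a (minimum, maximum) pair lets a single value and a range be handled in the same way.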

These elements are defined as follows: [declarations not preserved]

Leaves, material, foliation, and collation

Manuscripts are composed of leaves or sheets of materials such as paper or vellum, which are numbered or foliated according to one or more schemes, and which are bound together physically into one or more groups of gatherings or quires. There are various ways of expressing the arrangement of these quires, using numbers to represent each quire and the number of leaves of which it is composed: this expression of the structure is called the collation. In this subsection we define elements which may be used to record information about each of these aspects.


The leaves element groups information about the leaves or sheets of material used to form a manuscript, whether or not intended as a surface for writing. Each leaf has two sides (i.e. two pages), recto and verso.

A distinction is made between the body leaves of a manuscript, and the fly leaves which may precede or follow them: a flyleaf is here defined as any leaf which was originally left completely or almost completely blank at the beginning or end of a manuscript, regardless of whether it was subsequently used for writing or decoration. The following elements are used to define them:

flyLeaves: contains a description of the number and material of flyleaves of a manuscript.
bodyLeaves: contains a description of the number of leaves of the main, written, part of a manuscript, which may contain blank leaves within it (these are not flyleaves).

Attributes for both flyLeaves and bodyLeaves elements include one which states the number of leaves.

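For example, a manuscript containing two fly leaves, followed by 40 body leaves, and two further fly leaves might be described using a flyLeaves element, a bodyLeaves element, and a second flyLeaves element, recording two, 40, and two leaves respectively.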

Alternatively, more information about the leaves making up the manuscript might be supplied, using any of the following elements, repeated as often as necessary to describe the materials of which leaves are composed, any watermarks associated with them, and any damage they may have suffered:

material: describes the material of which leaves are composed. Its attributes include one which supplies a normalized form of name for the kind of material used. Legal values are defined by the materialTypes parameter entity; for leaves, the proposed values indicate paper, parchment or vellum, other, and unknown.
A further element describes any watermark present.
A further element describes any damage to the leaves, such as water-staining, burning, rodent damage, excised margins, etc.

A brief description only may be given, enclosed within a paragraph tag, as in the flyleaves of the example below. More usually, at least a material element will be used, as in the body leaves below:

modern paper, the first conjoint with the pastedown

Parchment, often with a marked difference between hair and flesh sides, often with irregular edges, arranged with the spine of the animal running horizontally

modern paper, the second conjoint with the pastedown

These elements are defined as follows: [declarations not preserved]


The term foliation refers to the method used to number the leaves of a manuscript, in order to give each one a unique identifier. It is customary to use the same folio number for the front and back side of a leaf, distinguishing the two by addition of the words recto, for the front side, and verso, for the back side (usually abbreviated to r and v respectively). When front and back sides of a leaf are given different numbers, the manuscript is said to be paginated.
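
The recto and verso convention lends itself to simple mechanical ordering. The following Python sketch, offered only as an illustration, turns references of the form used in the examples of this appendix (such as 34v or 185r) into sortable pairs. The function name and the accepted syntax are assumptions of this sketch, which makes no claim about the syntax of the range attribute; references using roman numerals for flyleaves are deliberately rejected.

```python
import re

# Arabic leaf number followed by r (recto) or v (verso); an assumed syntax.
FOLIO = re.compile(r"^(\d+)([rv])$")

def folio_key(ref):
    """Turn a reference such as '34v' into a sortable (leaf, side) pair."""
    match = FOLIO.match(ref)
    if match is None:
        raise ValueError(f"not a folio reference: {ref!r}")
    leaf, side = match.groups()
    # The recto of a leaf precedes its verso.
    return (int(leaf), 0 if side == "r" else 1)

refs = ["185r", "34v", "184v", "34r", "59v"]
print(sorted(refs, key=folio_key))
# ['34r', '34v', '59v', '184v', '185r']
```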

For descriptive purposes, it may be useful to record several different sets of foliation or pagination for the same manuscript. One of these must be used consistently as the reference foliation system for the range attribute.

The foliation element describes one or more of the numbering systems (usually foliation or pagination) applied to a manuscript. Its attributes include one which indicates whether this foliation is contemporary with the manuscript's contents or not; the legal values indicate that the foliation is original, that the foliation is not original, or that the date of the foliation is unknown.

This element may simply contain a prose description enclosed in one or more paragraphs, as in the following example:

Fols. 1-34 with near-contemporary(?) ink foliation in Arabic numerals

the front flyleaf and the remaining folios with modern pencil i and 35-260

Alternatively, more specific detail may be given using one or more elements which describe, respectively: the historical period during which the numbering system was applied to a manuscript; the medium (for example, ink, pencil, pigment, drypoint) by means of which the numbering system was applied to a manuscript; and the numbering system (for example, upper/lower-case roman numerals, arabic numerals, etc.) which has been applied to a manuscript.

These elements are defined as follows: [declarations not preserved]


The term collation is used to describe the manner in which individual groups of leaves and bifolia are organised into quires or gatherings, including details of the sequence and size of these quires, and indications of added or missing leaves. The collation is usually expressed as a formula.

The following elements are provided to record collation information:

collation: describes the make-up of a manuscript, in terms of a quire formula, defining its single leaves, bifolia, and quires, and the evidence supporting the formula.
quireGroup: contains a quire formula, expressed as a group of consecutive quireSequence elements, which is distinct in some significant way from those that precede or follow it.
quireSequence: contains a single quire, or a run of consecutive quires of the same size.

For example, a collation description might indicate that there are two distinct groups of gatherings, the first (corresponding with the modern foliation 1 to 13) being composed of a single 12 leaf gathering to which an additional leaf was added at the start; and the second (corresponding with modern folios 14 to 26) being composed of three gatherings, composed of six, four, and two leaves respectively.

The quire at the start of a new quireGroup will normally coincide with some new physical feature, scribe, artist, etc. There is no requirement that all quires of the same size necessarily form a single quireGroup.

The following example indicates that the flyleaves and pastedowns at both the front and back of the volume are original; and that the main body of the MS. consists of three discernible internal groupings, each defined by the fact that textual and physical divisions of the volume correspond with one another; the first of these occupies fols. 1-138, and consists of 13 quires (quires I-XIII, fols. 1-130) each composed of 10 leaves, and one quire (quire XIV, fols. 131-138) of 8 leaves; the second textual-structural unit occupies fols. 139-238, and consists of 10 quires (quires XV-XXIV), each of 10 leaves; the third textual-structural unit occupies fols. 239-258, and consists of 2 quires (quires XXV-XXVI) each of 10 leaves:

The written leaves preceded by an original flyleaf, conjoint with the pastedown

fols. 259-60 are two further original flyleaves, conjoint with two pasted-down leaves
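
The arithmetic implicit in such a quire formula can be checked mechanically. The following Python sketch records the three textual-structural units described above as plain tuples and confirms that the quires of each unit account for exactly the folios assigned to it. The data structure and function names are assumptions of this sketch and are not part of the tagset.

```python
# Each unit: (first folio, last folio, [(number of quires, leaves per quire), ...]).
# The figures are those given in the prose description of the example.
UNITS = [
    (1, 138, [(13, 10), (1, 8)]),
    (139, 238, [(10, 10)]),
    (239, 258, [(2, 10)]),
]

def leaves_in(runs):
    """Total leaves in a list of (quires, leaves per quire) runs."""
    return sum(quires * size for quires, size in runs)

def check(units):
    """Return True if each unit's quires account for exactly its folio range."""
    next_folio = units[0][0]
    for first, last, runs in units:
        if first != next_folio or leaves_in(runs) != last - first + 1:
            return False
        next_folio = last + 1
    return True

print([leaves_in(runs) for _, _, runs in UNITS])  # [138, 100, 20]
print(check(UNITS))  # True
```

The same check applied to a formula with a miscounted quire would return False, which is the kind of inconsistency a cataloguer may wish to catch before publication.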

The evidence element is used to present the evidence relating to the collation. It may contain either a sequence of paragraph elements, a discussion of the inscribed marks supporting the formula, or both, using the following elements:

evidence: groups elements providing evidence relating to the collation.
A further element contains a description of collational evidence which takes the form of inscribed marks. Its attributes specify:

the form of the marking; suggested legal values distinguish evidence provided by catchwords, evidence provided by signatures applied to each leaf, evidence provided by signatures applied to each quire, and other forms of evidence;
whether the markings are present throughout the manuscript, or only in part of it; suggested legal values indicate either that all quires (or leaves etc.) bear the specified information, or that only some quires (or leaves etc.) bear the specified information;
the physical placement of the markings on the leaf or quire;
the orientation of a marking, which may be horizontal, to be read vertically upwards, or to be read vertically downwards.

In the following example, the only evidence for the quire formula cited is the presence of one surviving catchword.

a trace of a catchword survives at fol. 127v


In the following example, more detailed evidence of catchwords is supplied:

Catchwords are present at the end of all quires except XIV, XVI, XXIV, and XXVI, written horizontally near the centre of the lower margin, in the same hand as the main text, usually between two dots

These elements are defined as follows: [declarations not preserved]

Script, rubrication, and secundo folio

The three elements defined in this subsection are used to describe aspects of the way in which a manuscript is written: the kind of script used, the presence of any rubrication, and any identification information provided by the secundo folio.


The script in which a manuscript or manuscript part is written may be described using the following elements. Note that the TEI Guidelines propose that a definitive list of the identified scripts or hands used in a manuscript should be supplied as a handList element within the profileDesc element of a standard TEI Header. The hand elements defined there do not permit the additional information proposed here; to retain compatibility, however, it is recommended that scriptNote elements should be linked with a corresponding hand element (using the hand attribute on the former) wherever possible.

scriptDesc: contains one or more scriptNote elements.
scriptNote: contains a description of the handwriting of the manuscript. Its attributes include hand, which supplies the identifier of a hand element elsewhere in the TEI Header containing further information about the script being described by this note.

written in a gothic liturgical bookhand in two sizes, according to liturgical function
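
The recommended link between scriptNote and hand elements can be verified mechanically. A validating SGML parser would normally perform this check itself if the hand attribute is declared as an identifier reference; the Python sketch below merely illustrates the idea on a well-formed XML rendering of a header fragment. The element names and the hand attribute are taken from the text, while the identifiers h1 and h9, the second note, and the function name are invented for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical header fragment: handList, hand, scriptDesc, scriptNote and the
# hand attribute are named in the text; the identifiers and the second note
# are invented for this illustration.
SAMPLE = """
<teiHeader>
  <fileDesc><sourceDesc><mssStmt><scriptDesc>
    <scriptNote hand="h1"><p>written in a gothic liturgical bookhand</p></scriptNote>
    <scriptNote hand="h9"><p>later additions</p></scriptNote>
  </scriptDesc></mssStmt></sourceDesc></fileDesc>
  <profileDesc><handList>
    <hand id="h1"/>
  </handList></profileDesc>
</teiHeader>
"""

def unresolved_hands(header):
    """List hand values on scriptNote elements that match no hand element."""
    known = {h.get("id") for h in header.iter("hand")}
    return [n.get("hand") for n in header.iter("scriptNote")
            if n.get("hand") not in known]

print(unresolved_hands(ET.fromstring(SAMPLE)))  # ['h9']
```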


These elements are defined as follows: [declarations not preserved]


The term rubrication is used here both literally, to refer to text written in red, and more generally to refer to any text distinguished from the surrounding script by use of colour or other marks such as paragraph marks, run over symbols etc. A description of the rubrication practice used in a manuscript or manuscript part may be supplied using the following element:

rubrication: describes any rubrics or headings, distinguished from the surrounding script by the use of colour or other devices.

A typical example of the use of this element might be:

headings in red, capitals touched with a yellow wash

headings, occasional paragraphs and underlinings in red, capitals touched in red; guides for rubrics often visible

This element is defined as follows: [declaration not preserved]

Secundo folio

In the Middle Ages and beyond, a commonly used means of distinguishing one copy of a text from another was to note the words beginning at a specific point in the text, generally the start of the second leaf, since no two handwritten copies of a given text are likely to have reached exactly the same point in the text after writing both sides of the first leaf. This practice makes it possible to match surviving manuscripts in modern libraries with references to them in medieval inventories and library catalogues.

The following element is provided to enable the recording of secundo folio:

secFol: contains the secundo folio of a manuscript, i.e. the first word or words on the second folio. Its attributes include one which contains an indication of the context from which the secFol has been taken.

The global range attribute should be used to specify the actual location of the secundo folio in terms of the standard foliation used for the manuscript. The locus may of course be the true second folio of a whole volume, or it may ignore coeval prefatory material, or other preceding material, or it may take account of a missing first folio, etc. For example, a secFol might indicate that the words venientem in appear at the beginning of what was originally the second folio of the text part of a manuscript, but which is now the 15th leaf.

More than one secFol element may be supplied. For example, in one manuscript the true second leaf (fol. 2r), which falls within added material, starts te dilecto; the second leaf of the original manuscript (fol. 5r) falls within a Calendar, and starts KL Sporkelle; while the second leaf of the main text of the original manuscript (fol. 18r) starts sijn volc want.

This element is defined as follows: [declaration not preserved]

Provenance information

The listProvenance element is used to group one or more provenance elements, usually in chronological order (most recent last).

provenance: contains a brief description of some stage in a manuscript's ownership or history.

A provenance element may contain a short prose note, as in this example:

Perhaps made for use in Paris, and presumably still there when bound in the late 16th-cent.; fol. 1 may have been added at the same time

Alternatively, a provenance element may contain detailed information about (for example) the former owner of a manuscript, together with a brief summary of the evidence for this ascription, as in the following example:

Rt. Hon. T. R. Buchanan probably by 1874; inscribed by him (?) in pencil 10. in the top left corner of the upper pastedown; given to the Bodleian in 1939 by his widow, Mrs. E. O. Buchanan.

A series of detailed provenance elements may be supplied, as in the following example:

Unidentified 18th/19th-cent. owner: inscribed in black ink in the top left corner of the upper pastedown No. 12.; the same hand wrote No. 19. in the same place in London, BL, Egerton MS. 3271 (see 4., below).

M. A. van der Linde, before 1864: Catalogue de la bibliothéque de M. A. van der Linde ... La vente aura lieu du 7 au 16 avril 1864 ... à la libraire G. -A. van trigt, Bruxelles, 1864, lot 202.

Rt. Hon. T. R. Buchanan, probably after 1874, but before 1891 (when it was exhibited at the Burlington Fine Arts Club); given to the Bodleian by his widow, Mrs. E. O. Buchanan, in 1941; inscribed in pencil D[onated]. 28.v.1941 with the present shelfmark on fol. 1r, and with the shelfmark alone on the front pastedown.

These elements are defined as follows: [declarations not preserved]

Binding description

The binding element contains a description of the state of the present and former bindings of a manuscript, including information about its material, any distinctive marks, and provenance information.

Sewing not visible; tightly rebound over 19th-cent. pasteboards, reusing panels of 16th-cent. brown leather with gilt tooling à la fanfare, Paris c. 1580-90, the centre of each cover inlaid with a 17th-cent. oval medallion of red morocco tooled in gilt (perhaps replacing the identifying mark of a previous owner); the spine similarly tooled, without raised bands or title-piece; coloured endbands; the edges of the leaves and boards gilt. Boxed.


Considerable additional work is needed to specify more detailed information relating to the components and format of bindings. At present, this element is defined as follows: [declaration not preserved]