1 Extended Pointers

The TEI scheme defines two generic pointer elements which support both inter- and intra-document linkage: <xptr> and <xref>. The only difference between them is that the former is empty, while the latter can contain phrase-level elements or PCDATA. The content of an <xref> is typically a string indicating how the link is to be rendered at the source end.

These elements share the attributes doc, from, and to, which are used to specify the target of the cross reference or link.

An <xptr> (or <xref>) may point to the whole of some other entity simply by supplying its name as the value of the doc attribute:

<!-- ... -->
see <xref doc=TEIP3>The TEI Guidelines, passim</xref>

This example assumes that some system or public entity with the name TEIP3 has been declared.

The from attribute is used to specify a location within the document specified by the doc attribute, using the TEI extended pointer syntax. In this language, locations are defined as a series of steps, each one identifying some part of the document, often with respect to locations identified in a previous step. For example, you would point to the third sentence of the fourth paragraph of chapter two by selecting chapter two in the first step, the fourth paragraph in the second step, and the third sentence in the last step. A step can be defined in terms of the SGML tree (using such keywords as parent, descendant, preceding, etc.) or, more loosely, in terms of text patterns or of word or character positions. You can also use a foreign (non-SGML) notation, or specify a location within a graphic in terms of its co-ordinate system.
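
As a rough illustration of the idea that a location is a series of steps, the following Python sketch splits a location string into (keyword, arguments) pairs. It is not part of the TEI scheme: the function name parse_location, the regular expression, and the decision to fold keywords to lower case are all assumptions made for this example, and only the simple keyword-plus-parenthesized-list form is handled.

```python
import re

# One step: a keyword optionally followed by a parenthesized argument list,
# e.g. "id (SA)" or "child (3 p)". A bare keyword such as "HERE" has no list.
# This pattern is an assumption for illustration, not the formal grammar.
STEP = re.compile(r"([A-Za-z]+)\s*(?:\(([^)]*)\))?")


def parse_location(spec):
    """Split a location string into a list of (keyword, [arguments]) steps."""
    spec = spec.strip()
    steps = []
    pos = 0
    while pos < len(spec):
        match = STEP.match(spec, pos)
        if match is None:
            raise ValueError(f"cannot parse step at offset {pos}: {spec[pos:]!r}")
        keyword = match.group(1).lower()  # assumption: keywords are case-insensitive
        args = match.group(2).split() if match.group(2) is not None else []
        steps.append((keyword, args))
        pos = match.end()
        # Skip any whitespace separating this step from the next one.
        while pos < len(spec) and spec[pos].isspace():
            pos += 1
    return steps


steps = parse_location("id (SA) child (3 p)")
```

Here steps holds two entries, one for the id step and one for the child step, in the order in which they would be applied.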

The from and to attributes use the same notation. Each points to a location within the target document; the target of the extended pointer is the whole sequence beginning at the start of the location indicated by the from attribute, and running to the end of the location indicated by the to attribute.
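
The way the two attributes combine can be sketched in Python as follows. The sketch assumes, purely for illustration, that a location can be modelled as a pair of character offsets into the target document; the names Location, locate, and target_span are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    start: int  # offset of the first character of the location
    end: int    # offset just past its last character


def locate(text, fragment):
    """Hypothetical helper: the location of the first occurrence of fragment."""
    start = text.index(fragment)
    return Location(start, start + len(fragment))


def target_span(from_loc, to_loc):
    """Span from the start of the 'from' location to the end of the 'to' location."""
    if to_loc.end < from_loc.start:
        raise ValueError("the 'to' location ends before the 'from' location starts")
    return Location(from_loc.start, to_loc.end)


TEXT = "First sentence. Second sentence. Third sentence."
span = target_span(locate(TEXT, "Second sentence."), locate(TEXT, "Third sentence."))
selected = TEXT[span.start:span.end]
```

In this sketch selected covers everything from the start of the second sentence to the end of the third, including the space between them.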

The first step in a location path will often be to specify the identifier of some element within the target document, as in this example:

<xptr doc=TEIP3 from='id (SA)'>

This selects the whole of whatever element bears the identifier SA within the entity TEIP3. If a finer-grained target is required, other steps might follow. The following keywords are available for you to specify other locations in terms of their relationship to this one:

In the above definitions and elsewhere, preceding and synonymous terms are to be understood as implying elements which would be encountered earlier when the document is processed correctly from beginning to end.

The term pseudo-element is used for any string of PCDATA content occurring between SGML tags, which is not itself a complete SGML element, but forms part of one. The <p> element in the following example

<p>See <xref doc=TEIP3>The TEI Guidelines, passim</xref>
for a full discussion</p>

has three children: the second is an element (the <xref>) while the first and last are pseudo-elements (the pieces of content data containing the words `See ' and `for a full discussion' respectively).
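
The three children can be made visible with a short Python sketch using the standard-library xml.dom.minidom parser, in which text nodes play the role of pseudo-elements. This is only an analogy: the example paragraph has been rewritten as well-formed XML (with a quoted attribute value) so that the parser accepts it, and the function name describe_children is an assumption.

```python
from xml.dom import minidom

# The example paragraph, rewritten as well-formed XML for this sketch.
SOURCE = (
    '<p>See <xref doc="TEIP3">The TEI Guidelines, passim</xref>\n'
    'for a full discussion</p>'
)


def describe_children(xml_text):
    """Return one (kind, content) pair per child of the root element."""
    root = minidom.parseString(xml_text).documentElement
    described = []
    for node in root.childNodes:
        if node.nodeType == node.ELEMENT_NODE:
            described.append(("element", node.tagName))
        elif node.nodeType == node.TEXT_NODE:
            # A run of character data between tags: a pseudo-element.
            described.append(("pseudo-element", node.data))
    return described


children = describe_children(SOURCE)
```

The list children has one entry for each of the three children described above: a pseudo-element, the xref element, and a second pseudo-element.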

Each of the above keywords implies a particular set of elements or pseudo-elements (the set of children, the set of ancestors, the set of previous siblings, etc.); to specify which of them you are pointing at, the keyword may optionally be followed by a parenthesized list containing:

Continuing the above example, the following reference will select the third <p> element directly contained by whatever element has the identifier SA:

<xptr doc=TEIP3 from='id (SA) child (3 p)'>

Note the difference between this and

<xptr doc=TEIP3 from='id (SA) child (3)'>

which selects the third child of the element bearing the identifier SA, whatever it may be. If entity TEIP3 contained the following text:

<div id=SA><head>Linking and Alignment</head>
<p id=Para1>Text of paragraph 1. </p>
<p id=Para2>Text of paragraph <num>2</num>, which is rather short.</p>
<p id=Para3>Text of paragraph <num>3</num>, which is also rather short.</p></div>

the above <xptr> would reference the second paragraph above, because the <head> element is also counted as a child element. Similarly, the following <xptr>

<xptr doc=TEIP3 from='id (Para3) child (3)'>

points to the pseudo-element `, which is also rather short.' within the element with identifier Para3.
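
The three pointers discussed above can be checked against the sample text with a Python sketch such as the following. It is an illustration only, not a conforming processor: the sample has been rewritten as well-formed XML, the helper names find_by_id, children_of, and child are hypothetical, and whitespace-only text between tags is assumed not to count as a pseudo-element.

```python
from xml.dom import minidom

# The sample text from entity TEIP3, rewritten as well-formed XML.
SAMPLE = """<div id="SA"><head>Linking and Alignment</head>
<p id="Para1">Text of paragraph 1. </p>
<p id="Para2">Text of paragraph <num>2</num>, which is rather short.</p>
<p id="Para3">Text of paragraph <num>3</num>, which is also rather short.</p></div>"""


def find_by_id(node, ident):
    """Depth-first search for the element whose id attribute equals ident."""
    if node.nodeType == node.ELEMENT_NODE and node.getAttribute("id") == ident:
        return node
    for sub in node.childNodes:
        found = find_by_id(sub, ident)
        if found is not None:
            return found
    return None


def children_of(node):
    """Elements plus non-blank text runs (assumption: blank text is ignored)."""
    return [c for c in node.childNodes
            if c.nodeType == c.ELEMENT_NODE
            or (c.nodeType == c.TEXT_NODE and c.data.strip())]


def child(node, n, name=None):
    """The n-th child (counting from 1), optionally counting only elements called name."""
    candidates = children_of(node)
    if name is not None:
        candidates = [c for c in candidates
                      if c.nodeType == c.ELEMENT_NODE and c.tagName == name]
    return candidates[n - 1]


root = minidom.parseString(SAMPLE).documentElement
sa = find_by_id(root, "SA")
third_p = child(sa, 3, "p")                           # id (SA) child (3 p)
third_child = child(sa, 3)                            # id (SA) child (3)
third_of_para3 = child(find_by_id(root, "Para3"), 3)  # id (Para3) child (3)
```

Under these assumptions third_p is the paragraph with identifier Para3, third_child is the paragraph with identifier Para2 (because the <head> is counted), and third_of_para3 is the trailing run of text inside Para3.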

Steps of the same or different kinds can be combined as required. Assuming for example that the entity TEIP3 is in fact a reference to the SGML form of the TEI Guidelines, then the following reference will select section 14.2.2 of that publication in which (as it happens) the extended pointer syntax is formally defined:

For full details, see
<xref doc=TEIP3 from='id (SA) child (2 div2) child (2 div3)'>
  TEI Extended pointer syntax definition</xref>

Complex specifications are easily built using this syntax. For example, the following reference will select the most recent <head> element which carries an attribute lang with the value LAT, and which occurs before the start of the element with identifier SA:

<xptr doc=TEIP3 from='id (SA) preceding (1 head lang lat)'>
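
One possible reading of this pointer is sketched below in Python. Several things here are assumptions rather than statements about the TEI definition: the sample document is invented, the helper names are hypothetical, "preceding" is taken to mean elements whose start-tags occur earlier in document order than the start of the reference element, counting backwards from the nearest, and attribute values are compared without regard to case.

```python
from xml.dom import minidom

# Invented sample document, written as well-formed XML for this sketch.
SAMPLE = """<body>
<div id="D1"><head lang="lat">Titulus primus</head><p>One</p></div>
<div id="D2"><head lang="lat">Titulus secundus</head><p>Two</p></div>
<div id="D3"><head lang="eng">Third heading</head><p>Three</p></div>
<div id="SA"><head lang="lat">Titulus quartus</head><p>Four</p></div>
</body>"""


def elements_in_order(node):
    """Yield every element below node in document (start-tag) order."""
    for sub in node.childNodes:
        if sub.nodeType == sub.ELEMENT_NODE:
            yield sub
            yield from elements_in_order(sub)


def preceding(root, target, n, name, attr, value):
    """The n-th matching element before target, counting back from the nearest."""
    earlier = []
    for element in elements_in_order(root):
        if element is target:
            break  # everything from here on starts at or after the target
        if (element.tagName == name
                and element.getAttribute(attr).lower() == value.lower()):
            earlier.append(element)
    return earlier[-n]


root = minidom.parseString(SAMPLE).documentElement
sa = next(e for e in elements_in_order(root) if e.getAttribute("id") == "SA")
nearest = preceding(root, sa, 1, "head", "lang", "lat")  # preceding (1 head lang lat)
heading = nearest.firstChild.data
```

With this sample the nearest matching <head> is the one in the second division: the English heading in the third division is skipped, and the heading inside SA itself starts after SA does.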

You can define the target of a link with respect to the location of the link itself, rather than with respect to the root of the document, by using the keyword HERE. For example,

<xptr from="HERE ancestor(1)">

points to the parent of the element within which it appears;

<xptr from="HERE ancestor(2)">

points to the grandparent of the element within which it appears, and so on. As this example also shows, when no value is supplied for the doc attribute, the current document is assumed. The HERE keyword makes sense only as the first step of a location path.