Oxford Text Archive
1. Introduction
2. Collections Development
2.1 Scope of Collections
2.2 Collection Data Types
2.3 Criteria for Evaluating Electronic Datasets
2.4 Acquisitions Methods
2.5 Acquisition Strategies
2.6 Rights Management
2.7 Payment
3. Collections Management
3.1 Documentation standards
3.2 Technical standards
3.3 Collections management database
4. Preservation
5. Access and Use
5.1 Collections Catalogue
5.2 Data Use
5.3 Rights Management
5.4 Data delivery mechanisms and tools
5.5 Payment
The OTA's Collections Policy follows an AHDS service-wide framework. It is written in four sections: Collections Development, Collections Management, Preservation, and Access and Use. Each of these defines a framework for activities within a broad area.
Collections Development focuses on the scope and nature of the content that will be collected by the OTA. Collections Management provides a framework for the administration, description and storage of digital data sets. Preservation defines long-term strategies for the archiving of collections. Access and Use specifies where and how the OTA collections may be used, and identifies primary users.
This section outlines the nature and scope of the OTA's collections and collaborative activities.
The OTA collects high-quality scholarly electronic texts and linguistic corpora (and any related resources) of long-term interest and use across the range of humanities disciplines. These digital resources either support, or result from, UK-based Higher Education research activities. Rather than reproduce holdings which are available elsewhere, the OTA may seek to make these available to the UK Higher Education community through the development of data exchange and access agreements.
2.1.1 Discipline coverage. The OTA has traditionally operated a very broad collections policy, and has archived electronic texts and corpora of interest not only to literary textual scholars, but also to those working in linguistics, history, law, modern and ancient languages, and indeed almost any humanities discipline which relies upon a close reading of texts. In future, priority will be given to digital resources of interest to those working in the literary and linguistic disciplines (including modern and ancient languages), whilst continuing to take materials from any literary genre, period, or language. Other disciplines will be supported on a best-effort basis.
2.1.2 Chronological range. There will be no chronological limits to the OTA's collections, although the availability of modern works is likely to be constrained by copyright legislation.
2.1.3 Copyrighted works. To the best of our knowledge and belief, all the resources held by, or accessible through, the OTA are either out of copyright or covered by appropriate permissions granted to the OTA; if you believe this not to be the case, please contact us immediately.
2.1.4 Digital resources in the OTA's Collection will either be held by the OTA, or maintained by a Cooperating Agency, and made available through OTA access and distribution channels. It is helpful in this context to make a distinction between static and volatile digital resources. A static resource is a coherent collection of digital information that is in its final state, and will not be changed. Static resources may be archived and curated by the OTA and form part of its physical holdings. A volatile resource is one which is being actively edited by its creator, and where possible OTA policy will be to create metadata catalogue entries and to point to the online collections maintained by a Cooperating Agency ("virtual holdings"). In cases where the digital resource creator wishes to make a volatile resource available for reuse but lacks network access, the OTA may take a "snapshot" which is periodically refreshed.
2.1.5 Rather than duplicate existing services the OTA will prioritise its collection policy according to perceived gaps in the provision for electronic archives and so as to ensure the preservation of orphaned digital resources. It is anticipated that these priorities will develop in response to the emergence of a national digital archive policy within the UK. In the interim a number of guidelines are likely to be proposed by the relevant funding bodies and granting agencies (JISC, British Academy, Leverhulme, British Library etc.).
2.1.6 Non-digital resources. While the OTA will normally hold only digital information, paper-based information about digital files may be held. The OTA will not hold any non-digital archives (e.g. printed books, journals, monographs, theses, photographs, sound recordings etc.); however, it may pursue funding for the digitisation of key archives. The location of related paper records and other non-digital materials will be documented whenever possible.
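The distinction drawn in 2.1.4 between static and volatile resources can be read as a simple decision rule. The following Python sketch is illustrative only: the names Holding, Resource and holding_for are hypothetical and do not describe any OTA system, and the policy itself says only that the OTA "may" hold or snapshot a resource, so the rule shows the preferred outcome rather than a guaranteed one.

```python
from dataclasses import dataclass
from enum import Enum


class Holding(Enum):
    """Hypothetical labels for the three outcomes described in 2.1.4."""
    PHYSICAL = "archived and curated by the OTA"
    VIRTUAL = "catalogue entry pointing to a Cooperating Agency"
    SNAPSHOT = "periodically refreshed copy taken by the OTA"


@dataclass
class Resource:
    title: str
    is_static: bool            # in its final state and will not be changed
    creator_has_network: bool  # creator can keep the resource online


def holding_for(resource: Resource) -> Holding:
    """Return the preferred kind of holding for a resource (assumed rule)."""
    if resource.is_static:
        # Static resources may form part of the physical holdings.
        return Holding.PHYSICAL
    if resource.creator_has_network:
        # Volatile and online elsewhere: catalogue it and point to it.
        return Holding.VIRTUAL
    # Volatile, but the creator lacks network access: take a snapshot.
    return Holding.SNAPSHOT
```

For example, a volatile resource whose creator has no network access would be mapped to Holding.SNAPSHOT, which corresponds to the periodically refreshed copy mentioned in 2.1.4.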
The OTA's Collections will comprise the full spectrum of textual material, irrespective of data type. These will include:
2.2.1 Electronic texts. The OTA will consider accessioning electronic versions of both published and unpublished works, of any literary type, genre, or period, including transcriptions of manuscripts or other similar materials, and in any language.
2.2.2 Textbases and corpora. The OTA will consider accessioning electronic literary or linguistic textbases or corpora, of any type, genre, period, or language. These textbases and corpora must be primarily text-based in nature, as resources which consist entirely of, say, digitized audio material are unlikely to be accessioned.
2.2.3 Databases. The OTA will consider accessioning electronic databases, provided that they fall within our Scope of Collections policy (e.g. data for a study of authorship attribution).
2.2.4 Digital image data. The OTA will consider accessioning digital image data, provided that they fall within our Scope of Collections policy (e.g. illustrations from a text, digitized images of manuscript pages or a particular printed edition, etc.).
2.2.5 Hypermedia. The OTA will consider accessioning hypermedia digital resources, provided that they fall within the Scope of Collections policy (e.g. a hypermedia edition of a particular work, or a hypermedia teaching resource about a particular author or genre, etc.).
2.2.6 Applications software. The OTA will not normally collect applications software developed for use with electronic texts or linguistic corpora, including teaching materials, unless it exists in a form which makes it viable for future use.
Through work with the user community the OTA will seek to identify those types of electronic text and linguistic corpora which have potential reuse value. The OTA anticipates that it will be of considerable benefit to both depositors and users that there be an effective and rigorous process of peer review of materials proposed for accessioning.
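Electronic texts and linguistic resources which are offered for deposit to the OTA will be evaluated to:
* assess their intellectual content and potential for reuse (see 2.3.1)
* evaluate their viability for management, preservation, and distribution (see 2.3.2)
* determine whether they need a primary archival home (see 2.3.3)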
Whereas the first form of evaluation involves assessment of the content of a digital resource, the second focuses more on data structure and format, and on the nature and completeness of any documentation supplied. The third evaluation criterion is intended to prevent duplication of digital archiving efforts within the community, and to preserve the integrity of existing digital archives. Such evaluation is essential to determine how best to manage a digital resource for the purpose of preservation and secondary reuse, and also to determine what costs may be involved in accessioning and migrating the digital resource.
An electronic text or linguistic corpus that does not meet all these criteria will not necessarily be rejected, particularly if it has significant reuse value. Instead, documentation accompanying such resources will be modified to warn users of potential pitfalls.
2.3.1 Assessing intellectual content and potential for reuse
The OTA seeks to accession high-quality material which will facilitate future humanities research. A rigorous review process will endeavour to ensure that the content of electronic texts and linguistic corpora is of the highest intellectual quality, created and prepared according to conventional standards of scholarly quality. However, other types of quality issues will also be taken into account:
Rarity: an electronic text or linguistic corpus that is of interest to one or more specialist communities shall be deemed to have rarity value. Resources which are readily available in other forms (e.g. as printed books), or in other electronic formats (e.g. as wordprocessor files), may also be deemed to have rarity value if they are not otherwise readily available in a particular format (e.g. as an electronic file, tagged with TEI-conformant SGML markup). Likewise, a resource whose longevity and future accessibility might otherwise be threatened (e.g. because it is currently only available on one particular web server) shall be deemed to have rarity value.
Reusability: an electronic text or linguistic corpus which is either known or believed likely to be of interest and relevance to members of the academic community, shall be deemed to have high reuse potential. Even if a resource is readily available elsewhere, it may be considered suitable for accessioning with the OTA, if it is deemed to have significant reuse value.
Accessibility: if the accessioning by the OTA of an electronic text or linguistic corpus will make that resource more accessible to members of the academic community (and thus improve its potential for reuse), this should be considered as part of that resource's evaluation.
2.3.2 Evaluating viability for management, preservation, and distribution
Although a resource may meet one or more of the evaluation criteria outlined above, it is still important to assess the likely cost implications of accessioning such a resource. This is particularly important with regard to the long-term management and preservation of a resource, as developments in hardware and software may require significant effort on the part of the OTA to ensure that a particular resource remains accessible to, and reusable by, members of the academic community. Where it is not cost-effective to maintain the accessibility and reusability of a resource, this is likely to affect whether it remains worthwhile to continue distributing that resource.
Format: the OTA provides extensive information and recommendations concerning its preferred format for deposited electronic texts and linguistic corpora [LINK TO RELEVANT PAGE(S) IN DEPOSITORS' PACK, CREATORS' GUIDELINES ETC.?]. Resources which have been prepared in accordance with these recommendations will be deemed to have low management, preservation, and distribution costs; whilst such considerations shall not preclude the OTA from accessioning resources which have not been prepared in accordance with these recommendations, they will be taken into account when evaluating a resource for accessioning.
Documentation: the OTA provides extensive information and recommendations concerning the documentation which should accompany any resource offered for deposit [LINK TO RELEVANT PAGE(S) IN DEPOSITORS' PACK, CREATORS' GUIDELINES ETC.?]. Whenever possible, documentation should be appropriate and sufficient to satisfy the requirements for reuse by other members of the academic community, including the staff of the OTA who will be required to catalogue the resource. A poorly documented resource is likely to be much less reusable, and more expensive for the OTA to accession, whilst the opposite is true of a resource that has been prepared in accordance with the OTA's documentation recommendations. Whilst poor documentation shall not preclude the OTA from accessioning a particular resource, this factor will be taken into account when evaluating a resource for accessioning.
2.3.3. Determining need of primary archival home
There is no need to duplicate digital archiving services. If an electronic text or linguistic corpus (or other digital resource of interest to the OTA and its users) is being properly cared for by an organization other than the OTA, it will not be given high priority for accessioning. Similarly, if a resource is offered for deposit with the OTA, the OTA reserves the right to recommend that it should be offered to an alternative archival service if the OTA believes that this would be a more appropriate place for that resource.
The electronic texts and linguistic corpora in the OTA's Collection will be acquired by a limited number of means. A consistent approach will be developed for each kind of source.
Electronic texts and linguistic corpora will enter the OTA's Collection by deposit under the AHDS Common Deposit Agreement (see the Rights Management Framework: Common Deposit Agreement).
Electronic texts and linguistic corpora will also be made available to OTA users through Cooperative Agreements signed with other agencies to enable access to and/or distribution of electronic texts or linguistic corpora held in other collections.
Whilst the OTA is unlikely to purchase or license electronic texts or linguistic corpora directly, it may approach other bodies for funding to acquire an electronic text or linguistic corpus of particular value.
2.4.1. Direct Deposit by Individuals or Institutions.
The OTA will acquire and store electronic texts and linguistic corpora produced by individuals, projects or institutions. This strategy is preferable in the case of fixed or static resources.
As a condition of acquisition, the OTA will negotiate the broadest possible assignment of rights to guarantee access and enable redistribution of the digital resource. The OTA will negotiate a non-exclusive license to distribute deposited electronic texts and corpora. Resources with severe restrictions will be accepted only under exceptional circumstances.
2.4.2. Collaboration with other Agencies.
Where appropriate the OTA will negotiate data exchange or access agreements with other organizations. These agreements will ensure similar levels of data integrity and access as offered by the OTA in order to maintain consistency across the collections.
Priorities for acquisitions will be defined by the OTA in consultation with its Advisory Board and Consultative Group, in conjunction with the AHDS Executive.
2.5.1. Agreements with Cooperating Agencies.
Where significant bodies of material are held by other agencies, the OTA may pursue Cooperative Agreements for the exchange of catalogue data and access to information, in preference to direct acquisition. The OTA will monitor the collecting activity and scope of other agencies to identify opportunities for collaboration.
2.5.2. Agreements with Granting Agencies.
The AHDS will seek to sign agreements with granting agencies which support humanities research, to encourage funding recipients to offer their datasets for deposit with the AHDS Service Providers.
The Executive will seek to strike and sign such agreements with funding agencies which support research in two or more humanities disciplines. To date such agreements have been reached with:
The OTA will seek to strike and sign such agreements with granting agencies which fund research exclusively or primarily in text- or linguistic-based disciplines. To date, no such agreements have been reached.
2.5.3. Retrospective Survey of Past Grant Holders.
In conjunction with agreements to acquire electronic texts and linguistic corpora produced as the result of new awards, the AHDS and OTA will contact previous grant holders, soliciting information regarding digital resources that may have been produced under previously funded projects. Where appropriate, grant holders will be contacted regarding possible deposit.
2.5.4. Acquiring Data from Individual (and Institutional) Depositors.
The OTA will accession digital resources from individual and institutional depositors.
The OTA will inform depositors and potential depositors about the deposit process, and provide guidance to them in the form of the OTA Depositors' Pack [INCLUDE A URL TO THE RELEVANT PAGE].
The OTA will negotiate deposit agreements and access conditions directly with depositors and ensure that all deposited resources are accompanied by a signed Common Deposit Agreement, a Schedule of Deposited Materials, a Technical Description of Deposited Materials, a Study Description providing appropriate documentation, and appropriate details about the depositor. [INCLUDE AN APPROPRIATE URL?]
2.5.5. Collection Analysis and Focused Acquisitions
Rather than duplicate existing services the OTA will prioritise collecting activity, according to perceived gaps in the provision of electronic archives. The OTA will regularly monitor the use of its collections in order to identify areas where materials are in high demand. Collections Development activities may be directed to supplement these areas. In consultation with its Advisory Board and Consultative Group, and as a result of analysis of its collections, the OTA may adopt specific strategies to strengthen its holdings in under-represented areas, or to develop comprehensive holdings in a specific area.
The AHDS and the OTA will use their expertise to guide the activities of those responsible for the acquisition of Content for the UK Higher Education Network, including the Content Working Group of the JISC Committee on Electronic Information.
2.5.6. Resource Creation
The OTA will not normally enter into digital resource creation projects, but if a significant resource is identified as missing, the OTA may enter into partnerships with agencies to advise on resource creation, or endorse funding applications targeted at resource development.
[NB IN THE AHDS EXECUTIVE'S INTERNAL DRAFT OF THE COLLECTION POLICY, 3 ADDITIONAL SUB-SECTIONS APPEAR AT THIS POINT COVERING "ACQUISITIONS PROCEDURES", THE CENTRAL "COLLECTIONS DEVELOPMENT DATABASE", AND THE "DISPOSITION OF DATASETS". WE MAY WANT TO INCLUDE SOME/ALL OF THAT MATERIAL AT THIS POINT?]
A Common Deposit Agreement for use with individual and institutional depositors, and Model Cooperating Agreements for use with Cooperating data collecting agencies, will be implemented to protect the rights of depositors and users (see the Rights Management Framework: Common Deposit Agreement).
2.7.1. Payment for Acquisitions.
While the OTA is unlikely to purchase or license digital resources directly, it may approach other bodies, such as the Committee on Electronic Information's Content Working Group, for funding to acquire a dataset of particular value.
2.7.2. Receiving Payment for Services.
Any charges levied on depositors by the OTA will conform to the AHDS's Common Charging Policy (see the Rights Management Framework: Common Charging Policy).
Collections Management procedures and systems will be developed by the OTA. These will be based on an AHDS-wide framework of documentation and technical standards, and indicate target service (or performance) levels. Standards will be documented in the AHDS' "Managing Digital Collections" handbook.
A framework of technical standards will define the file formats in which data will be accepted and stored, and will outline the hardware/software environment within which the OTA Collections will be catalogued, maintained and distributed.
* File and data interchange formats, and storage media. The OTA will define a series of preferred and acceptable file and data interchange formats for all categories of data that are offered for deposit. In addition, the OTA has taken a service-wide responsibility for electronic texts and linguistic corpora and will specify which file and interchange formats and storage media are preferred/required from depositors, used to maintain its holdings, and preferred/required for delivering such resources to users.
* Network/Communications Protocols. A framework for communications both within the OTA, within the AHDS as a whole, and with the community at large will be defined. This will emphasise interoperability and compatibility with the JANET and SuperJANET infrastructure and with the Internet more broadly.
* Network Security. A framework for maintaining the security and integrity of digital resources and their documentation will be developed and implemented by the OTA.
Unlike the other AHDS Service Providers, the OTA has no plans to maintain a conventional separate Collections Management Database, but will instead extract all necessary information from each resource's associated metadata, to allow the OTA to provide descriptive, technical, and administrative data for documenting its collection. Extracts of such information will be made available to users of the collection, in the form of the collections catalogue. Other recorded information may be considered confidential.
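The idea of deriving a public catalogue from each resource's own metadata, while keeping other recorded information confidential, can be sketched as a simple filtering step. This is a minimal Python sketch under stated assumptions: the field names in PUBLIC_FIELDS and the functions catalogue_entry and build_catalogue are hypothetical, since the policy does not say which metadata fields are public or how the metadata is stored.

```python
from typing import Any, Dict, List

# Assumed set of fields safe to publish; the policy does not enumerate them.
PUBLIC_FIELDS = {"title", "author", "language", "format"}


def catalogue_entry(metadata: Dict[str, Any]) -> Dict[str, Any]:
    """Keep only the fields assumed to be public; drop everything else."""
    return {key: value for key, value in metadata.items() if key in PUBLIC_FIELDS}


def build_catalogue(all_metadata: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Build the catalogue as one public extract per resource."""
    return [catalogue_entry(metadata) for metadata in all_metadata]


# Made-up record, for illustration only.
record = {
    "title": "Example Text",
    "language": "en",
    "depositor_contact": "kept confidential",
}
public_view = catalogue_entry(record)  # contains only "title" and "language"
```

Filtering against an explicit list of public fields means that any newly recorded administrative field stays confidential by default, which seems closest to the intent of the paragraph above.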
The long-term value of digital resources requires investment in their maintenance over time. The AHDS will research and implement technical and organisational strategies to ensure the long-term viability of its collections. In aid of this research the OTA will document:
* The importance of digital resource preservation to the community of users of electronic texts and linguistic corpora
* Particular issues involved in the preservation of digital resources appropriate to that community
* OTA digital preservation practices. The latter may include information under the following heads which shall be further developed by the AHDS Executive.
The OTA encourages broad access to and use of its collection, both within the UK research community and beyond. Systems will be developed that enable the searching of the entire OTA collection, the collections of co-operating agencies, and the collections of other AHDS service providers, and that facilitate the delivery and use of electronic texts and linguistic corpora for a full range of teaching and research functions.
All items in the OTA collection and digital resources held by collaborating agencies will be described in the OTA's Collection Management Database. Portions of this database will be made available to users of the collections as a catalogue.
The online catalogue of the OTA's collection will be developed and be made widely and freely available. The OTA Catalogue will include a common core of shared metadata that will enable researchers to search across the holdings of all AHDS Service Providers and across those of collaborating agencies.
Integrated access will be facilitated by adopting the Z39.50 network protocol. Access to the OTA's online catalogue will be Z39.50 compatible.
The OTA will investigate ways of aiding and encouraging the use of its collections. These may include, but will not be limited to, workshops, training events, lectures, and the development of Guides to Good Practice. If necessary, the OTA may support the design and development of tools to support data analysis, although it is unlikely to undertake development alone.
The OTA will operate as a subscription service. Any user will be able to acquire information about its holdings. Only registered users will normally be able to browse, order, or otherwise acquire access to the digital resources which make up the OTA's collection.
All users will register once with the AHDS with online registration facilities developed for the Executive. In registering, users will agree to the conditions of use specified in a Common Access Agreement and those which are attached to any digital resources (e.g. electronic texts or linguistic corpora) they may acquire from the OTA. Users to whom any digital resources are distributed will fill in an online Users' Details Form giving an indication of how and why they propose to use the resource.
Digital resources will be distributed to users by a limited number of means, specified by the OTA, and notified widely.
Where appropriate the OTA may develop specific tools or working environments for the delivery and use of significant or high-demand digital resources.
Information will be supplied to users and potential users about:
All users will be required to:
Any charges levied upon users by the OTA will conform to the AHDS's common charging policy.
Updated last by M G Popham on December 22, 1997
