Information for institutions: how we collect theses and how the data can be re-used
We work with all UK universities to build a comprehensive index of all UK theses and provide download options where available.
Thesis records and metadata are gathered from various sources, while the full theses are ingested only with the agreement of each institution. Participation in EThOS is governed by a light-touch Memorandum of Understanding signed between each institution and the British Library.
Here we describe how we aggregate the data and provide guidance for participating institutions. We also explain how EThOS harvests, uses, displays and shares each metadata field, such as 'Awarding Body', and how researchers can re-use the data for their own research.
We harvest doctoral thesis metadata from all UK institutional repositories and add the harvested data to EThOS. A careful matching process updates existing EThOS records with additional data, and a new record is created where one does not already exist. Each UK institutional repository is harvested six times a year using OAI-PMH (the Open Archives Initiative Protocol for Metadata Harvesting).
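For repository managers who want to see what an OAI-PMH harvest involves, the following is a minimal sketch in Python, not a description of the EThOS harvester itself. It issues ListRecords requests and follows resumption tokens until the repository reports no more pages. The base URL, the choice of the oai_dc metadata format and the function names are assumptions made for illustration; a repository may expose theses in a different format or set.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# XML namespaces defined by OAI-PMH 2.0 and Dublin Core.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"
DC_NS = "{http://purl.org/dc/elements/1.1/}"


def parse_list_records(xml_text):
    """Return (records, resumption_token) from one ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iter(OAI_NS + "record"):
        header = record.find(OAI_NS + "header")
        identifier = header.findtext(OAI_NS + "identifier")
        titles = [t.text for t in record.iter(DC_NS + "title")]
        records.append({"identifier": identifier, "titles": titles})
    token_el = root.find(f"{OAI_NS}ListRecords/{OAI_NS}resumptionToken")
    # An absent or empty token means this was the last page.
    token = token_el.text.strip() if token_el is not None and token_el.text else None
    return records, token


def harvest(base_url, metadata_prefix="oai_dc", fetch=None):
    """Yield records from every page of a ListRecords harvest.

    `fetch` is a hypothetical hook (URL in, response text out) so the
    function can be exercised without a network connection.
    """
    if fetch is None:
        fetch = lambda url: urllib.request.urlopen(url).read().decode("utf-8")
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    while True:
        url = base_url + "?" + urllib.parse.urlencode(params)
        records, token = parse_list_records(fetch(url))
        yield from records
        if not token:
            break
        # Follow-up requests carry only the verb and the resumption token.
        params = {"verb": "ListRecords", "resumptionToken": token}
```

A call such as `list(harvest("https://repository.example.ac.uk/oai"))` would collect every record the placeholder endpoint exposes. Records flagged as deleted carry no metadata, so their list of titles is empty.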
Many institutions hold older thesis records in their library catalogue rather than their repository. If you have thesis records that are not in your repository, send us a data file from your catalogue in MARC or CSV format and we’ll convert the data to add to EThOS.
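As an illustration of what converting a CSV catalogue export might look like, here is a rough sketch in Python. It is not the conversion EThOS runs: the column headings and the target field names in COLUMN_MAP are hypothetical, and a real export would need its own mapping.

```python
import csv
import io

# Hypothetical mapping from catalogue CSV headings to thesis record fields.
COLUMN_MAP = {
    "Title": "title",
    "Author": "author",
    "Awarding Body": "awarding_body",
    "Year": "year",
}


def csv_to_records(csv_text):
    """Turn catalogue CSV text into a list of thesis record dictionaries."""
    reader = csv.DictReader(io.StringIO(csv_text))
    records = []
    for row in reader:
        record = {
            field: (row.get(column) or "").strip()
            for column, field in COLUMN_MAP.items()
        }
        # Skip rows without a title, since they cannot be matched or listed.
        if record["title"]:
            records.append(record)
    return records
```

Columns that are missing from the file simply produce empty values, so a partial export still converts and the gaps are easy to spot afterwards.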
If you have theses for which there is no electronic record at all, please contact us to discuss options for data creation.
EThOS lists around 97% of all UK thesis titles. If you know of thesis titles that are not listed in EThOS, please get in touch and help us to fill those gaps.
UK theses are normally described using a core set of bibliographic metadata called the UKETD_DC application profile. This was developed to ensure UK theses are described in a clear and consistent way, allowing users to find the theses they seek and institutions to share the data between repositories. This additional table describes all mandatory and optional metadata fields, with tips on how to capture the data in your own repositories.
EThOS data can be re-used by participating institutions. The full dataset or subsets of metadata are also available to download or harvest for research or re-use by other organisations, discovery services and research projects seeking quality data for analysis.
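As a simple example of re-use, the sketch below counts theses per institution in a downloaded metadata file. It assumes the data has been saved as CSV with a column named "Institution"; that column name is a placeholder, so check the headings in the file you actually download.

```python
import csv
import io
from collections import Counter


def theses_per_institution(csv_text, column="Institution"):
    """Count rows per institution in CSV metadata text.

    The default column name is an assumption, not a documented heading.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter()
    for row in reader:
        name = (row.get(column) or "").strip()
        if name:
            counts[name] += 1
    return counts
```

Calling `theses_per_institution(text).most_common(10)` on the file contents would list the ten institutions with the most records in that subset.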
Over 60% of all known theses are available in digital format. Where a digital copy exists, users can download it from EThOS or follow a link in the EThOS record to the institution’s own copy.
We ingest the files into EThOS only with permission from each institution as indicated in their Memorandum of Understanding. Many institutions consider EThOS a useful back-up database for their theses and value the wider availability of their research. A small number prefer to manage all re-use of their theses directly through their own repository.
By agreement with each institution, EThOS also provides seamless digitisation on demand for print theses. Institutions may opt in or out of this service, and can fund the digitisation themselves or pass the scan cost on to the user. The list of participating institutions indicates which institutions offer funded and unfunded digitisation on demand.