Depositing electronic journals: technical guidance notes
Publishers are encouraged, but are not under any obligation, to deposit a copy of their electronic journal publications. The purpose of these notes is to provide information about the methods of depositing that a publisher may choose. It is essential that publishers discuss and agree a mutually acceptable process with the Library.
In these notes:
- ‘Preferred’ methods are supported by the British Library; publishers are requested to use ‘preferred’ methods wherever practicable.
- ‘Acceptable’ methods are also supported by the British Library but their availability may be limited by resources or other constraints.
- ‘Unsupported’ methods are either unavailable or may only be used in exceptional circumstances.
Deposits, especially those containing multiple files, should be collated into one package for each journal article (or issue). TAR bundles, ZIP or GZIP files may all be used, for delivery by:
- FTP push, i.e. upload to the Library’s FTP site; or
- FTP pull, i.e. upload to the publisher’s own FTP site for subsequent download by the Library.
Where appropriate, delivery should be secured either by FTPS (FTP over SSL/TLS) or by SFTP (SSH File Transfer Protocol). Logins, passwords or encryption keys should be agreed and shared in advance.
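As an illustration of the FTP push option over FTPS, the following is a minimal sketch using Python's standard `ftplib` module. The function names, and the host, user name and password that a caller would pass in, are hypothetical; the real connection details are whatever has been agreed with the Library.

```python
from ftplib import FTP_TLS
from pathlib import Path


def upload_package(ftp, package: Path) -> str:
    """Upload one package over an already-authenticated FTP connection."""
    with package.open("rb") as handle:
        # STOR writes the file under its own name in the current remote directory.
        return ftp.storbinary(f"STOR {package.name}", handle)


def deposit_over_ftps(host: str, user: str, password: str, package: Path) -> str:
    """Open an FTPS session, protect the data channel, and upload the package.

    host, user and password are placeholders for the agreed credentials.
    """
    with FTP_TLS(host) as ftp:
        ftp.login(user, password)  # FTP_TLS secures the control channel before logging in
        ftp.prot_p()               # switch the data connection to TLS as well
        return upload_package(ftp, package)
```

Keeping the upload step separate from the connection step means the same function could be reused with a different transport, or exercised with a stand-in object during testing.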
A consistent manner of naming files and directory structures should be used. Titles should be distinct and each article or journal issue should be separately identifiable.
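One possible way to collate an issue into a single, consistently named package is sketched below. The naming pattern (ISSN, volume and issue joined by underscores) and the function names are assumptions made for illustration only; the actual convention is a matter for agreement with the Library.

```python
import re
import tarfile
from pathlib import Path


def package_name(issn: str, volume: str, issue: str) -> str:
    """Build a predictable archive name such as '1234-5678_v12_i3.tar.gz'."""
    parts = [issn, f"v{volume}", f"i{issue}"]
    # Replace anything other than letters, digits, '.' and '-' so the name
    # is safe on most file systems and FTP servers.
    safe = [re.sub(r"[^A-Za-z0-9.-]", "-", part) for part in parts]
    return "_".join(safe) + ".tar.gz"


def build_package(source_dir: Path, out_dir: Path,
                  issn: str, volume: str, issue: str) -> Path:
    """Collate every file under source_dir into one gzip-compressed TAR bundle."""
    target = out_dir / package_name(issn, volume, issue)
    with tarfile.open(target, "w:gz") as tar:
        for path in sorted(source_dir.rglob("*")):
            if path.is_file():
                # Store paths relative to source_dir so the layout is reproducible.
                tar.add(path, arcname=path.relative_to(source_dir).as_posix())
    return target
```

In this sketch `out_dir` should sit outside `source_dir`, otherwise the archive being written would be picked up as one of its own members.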
Publications consisting of a single digital file (e.g. PDF) without separate metadata may be delivered by:
- Email, provided that it is in an agreed manner, in an agreed format or structure, and to a Library-specified address.
- Making the complete publication openly available for download from a specified web address.
In exceptional circumstances, delivery on tape, CD, DVD or other hand-held media may be permitted by mutual agreement.
Publishers should notify the Library, in an agreed manner, each time that material is deposited or made available for download, using FTP trigger files, RSS/Atom, OAI-PMH or ONIX for serials.
Notification should include a manifest or summary in an agreed format of what the complete deposit package contains (e.g. listing each digital file by name, with size and type etc), so that the Library may check for safe receipt.
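A manifest of this kind could be generated automatically from the files in a package. The sketch below is one possibility and not an agreed format: the field names (`name`, `size`, `type`, `sha256`) are assumptions, and the SHA-256 checksum is an addition that is not required by the text above.

```python
import hashlib
import mimetypes
from pathlib import Path


def build_manifest(package_dir: Path) -> list[dict]:
    """List every file in the package with its name, size, type and checksum."""
    entries = []
    for path in sorted(package_dir.rglob("*")):
        if not path.is_file():
            continue
        # Guess the media type from the file extension; fall back to a generic type.
        media_type, _ = mimetypes.guess_type(path.name)
        entries.append({
            "name": path.relative_to(package_dir).as_posix(),
            "size": path.stat().st_size,
            "type": media_type or "application/octet-stream",
            "sha256": hashlib.sha256(path.read_bytes()).hexdigest(),
        })
    return entries
```

The resulting list can be serialised in whatever structure is agreed, for example with `json.dumps(build_manifest(package_dir), indent=2)`.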
Notification by email is also acceptable, provided that it is in an agreed manner, using an agreed format or structure, and to a Library-specified address.
It is likewise acceptable for the manifest or summary of what the deposit package contains to be created manually, or to be omitted altogether.
Notification by post or telephone, or by email without prior agreement, is unsupported.
The Library will check each file received to ensure that it appears to be complete, virus-free and syntactically valid. Use of checksums is preferred.
It is preferred that publishers supply a manifest or list of all relevant digital files in the deposit package (see Preferred options under Notification of Deposit) so that the Library may check for safe receipt of the complete package.
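The check for safe receipt might be automated along the following lines. This is a sketch only, not a description of the Library's own procedure. It assumes a manifest expressed as a list of entries with hypothetical `name`, `size` and `sha256` fields, and reports anything missing, altered or unlisted.

```python
import hashlib
from pathlib import Path


def verify_package(package_dir: Path, manifest: list[dict]) -> list[str]:
    """Compare the files in package_dir with the manifest and list any problems."""
    problems = []
    listed = set()
    for entry in manifest:
        listed.add(entry["name"])
        path = package_dir / entry["name"]
        if not path.is_file():
            problems.append(f"missing: {entry['name']}")
            continue
        data = path.read_bytes()
        if len(data) != entry["size"]:
            problems.append(f"size mismatch: {entry['name']}")
        elif hashlib.sha256(data).hexdigest() != entry["sha256"]:
            problems.append(f"checksum mismatch: {entry['name']}")
    # Report files that are present but were never listed in the manifest.
    for path in sorted(package_dir.rglob("*")):
        name = path.relative_to(package_dir).as_posix()
        if path.is_file() and name not in listed:
            problems.append(f"unlisted: {name}")
    return problems
```

An empty list means that every listed file arrived intact and nothing unexpected was included. A publisher could run the same check on the package before sending it.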
A process for managing and resolving any errors should be agreed in advance.
A process should also be agreed in advance for dealing with any subsequent errata, legal notices, amendments or other updates to content which has already been deposited and incorporated into the Library’s collection.
Preferred / Acceptable:
Content may be submitted in any of the following ways, expressed in descending order of preference:
- Full text XML, using the National Library of Medicine’s Archiving and Interchange DTD version 2.3, and with page representation also supplied in PDF
- Full text XML, using a different DTD/schema which is also supplied with the deposited content, and with page representation also supplied in PDF
- PDF page representations, with XML metadata headers using the National Library of Medicine’s Archiving and Interchange DTD version 2.3
- PDF page representations, with XML metadata headers in a different DTD/schema which is also supplied with the deposited content
- PDF page representations only
- Pages in HTML, RTF or other formats which are in common and widespread use and which are either reusable without a licence or for which it is reasonable to expect that the Library already holds licences (such as Microsoft Word)
Where possible, PDF files should be derived from a typesetting or desktop publishing process rather than scanned images in a PDF wrapper.
The following file formats for non-text types of content will all be accepted:
- Still images in GIF, TIFF, JPEG, JPEG 2000 or PNG, plus other file formats in widespread use, as appropriate. “In line” images should be included with any HTML or XML deposits.
- Audio content in AIFF, WAV, MPEG (MP3) or other file formats in widespread use, as appropriate.
- Video content in AVI, MOV, MPEG (MP4) or other file formats in widespread use, as appropriate.
Interactive elements may be acceptable, provided that the software or scripts required to render or use them (such as Java) are also included with the main content; however, their successful reproduction in a repository or archive environment cannot be assured.
Other file formats that are not in common or widespread use are unsupported.
Publishers should supply metadata which describes the items deposited, for the purposes of indexing, discovering, identifying and selecting the digital resource.
Metadata must include at least one unique identifier, preferably one that is externally recognisable such as a DOI (CrossRef) or ISSN with the volume and issue number. The identifier must be one which can be used to identify the publication at a later date in case of amendments, errata or requests for notice and takedown.
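Where an ISSN forms part of the identifier, its check digit can be verified before deposit. The sketch below applies the standard ISSN check-digit rule (weights 8 down to 2 on the first seven digits, modulo 11, with `X` standing for 10). The function name is hypothetical, and this check is an illustration, not a requirement stated in these notes.

```python
import re


def issn_is_valid(issn: str) -> bool:
    """Return True if the string is a well-formed ISSN with a correct check digit."""
    text = issn.strip().upper()
    if not re.fullmatch(r"[0-9]{4}-?[0-9]{3}[0-9X]", text):
        return False
    chars = text.replace("-", "")
    # Weight the first seven digits 8, 7, 6, 5, 4, 3, 2 and sum the products.
    total = sum(int(digit) * weight
                for digit, weight in zip(chars[:7], range(8, 1, -1)))
    check = (11 - total % 11) % 11
    expected = "X" if check == 10 else str(check)
    return chars[7] == expected
```

A check of this kind catches transcription errors in a single digit, but it cannot confirm that the ISSN belongs to the journal being deposited.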
Metadata must be:
- Standard, self-describing, and interoperable (a copy of the DTD/schema must be provided if it is not NLM)
- Accurate and, where possible, validated using agreed validation rules
- In one of the following formats, in descending order of preference:
  - XML headers using the National Library of Medicine Archiving and Interchange DTD version 2.3
  - XML headers using a different DTD/schema which must also be deposited with the content
  - A copy of metadata supplied to CrossRef for DOI registration purposes
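Before depositing, a publisher might confirm that an XML header actually carries the identifier that the metadata must include. The sketch below assumes NLM-style element names (`journal-meta`, `article-meta`, `article-id` with `pub-id-type="doi"`, `issn`, `volume`, `issue`); the function names and the rule used to judge sufficiency are illustrative assumptions, and a different DTD/schema would need different paths.

```python
import xml.etree.ElementTree as ET


def extract_identifiers(xml_text: str) -> dict:
    """Pull the DOI, ISSN, volume and issue out of an NLM-style XML header."""
    root = ET.fromstring(xml_text)
    return {
        "doi": root.findtext(".//article-meta/article-id[@pub-id-type='doi']"),
        "issn": root.findtext(".//journal-meta/issn"),
        "volume": root.findtext(".//article-meta/volume"),
        "issue": root.findtext(".//article-meta/issue"),
    }


def has_unique_identifier(ids: dict) -> bool:
    """Accept either a DOI, or an ISSN together with volume and issue number."""
    return bool(ids["doi"]) or all(ids[key] for key in ("issn", "volume", "issue"))
```

Fields that are absent from the header come back as `None`, so a header with neither a DOI nor a complete ISSN, volume and issue combination is reported as lacking an identifier.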
Publishers are not required to generate metadata when depositing journals which are published only as simple page representation files (PDF) or as HTML web pages without accompanying metadata.
Preferred / Acceptable:
All Technical Protection Measures (TPMs) and controls that are used to enforce digital rights policies must be removed by the publisher prior to depositing.
The Library’s repository is secure; separate controls will be implemented to govern how content deposited by publishers may be used and to protect against unauthorised access.
TPM-controlled publications that are deposited without any means to ‘unlock’ them may be byte-preserved but cannot be made usable.