Front page to ‘The Calcutta Gazette’ quarterly list for the quarter ending 31 March 1875, SV 412/8

Quarterly Lists: Digitally Researching Catalogues of Indian Books

Tom Derrick discusses the value of applying digital research methods to bibliographic sources for scholars of Indian publishing history.

As well as digitising rare early printed Indian books, the Two Centuries of Indian Print project is making available online some wonderful catalogues held by the library. Generally known as the Quarterly Lists, they record, by quarter and by province, all books published in British India between 1867 and 1947.

The catalogues complement the Bengali printed books, such as Koner Ma Kande. In this post I’d like to share a bit more about what the Quarterly Lists are and what we are doing to make them as accessible as possible for book history researchers who want to apply digital research methods to explore their rich contents.

Firstly, a little more about the origins of these catalogues. With the passing of The (Indian) Press and Registration of Books Act, 1867, it became mandatory for all books published in the provinces of British India to be sent to the provincial secretariat library for registration. The India Office Library and the British Museum Library in London, whose collections were later united in the British Library, were each given the privilege of requesting books from these lists free of charge, in what amounted to a colonial legal deposit arrangement.

The act was passed with the aim of recording the ever-growing number of publications originating from the various printing presses throughout India, its purpose political as well as archival. Not all works that issued from the presses were recorded in the lists, and only a small percentage were actually deposited in the London collections. The library curators in London selected only those works which they thought were important or interesting. The Quarterly Lists were originally published as appendices in the official provincial newspapers, such as the Calcutta Gazette and the Bihar and Orissa Gazette.

Although Independence brought an end to the arrangement for depositing publications with the India Office Library and British Museum Library, the practice of publishing catalogues of registered printed books continued until the late 1960s.


Above: An interactive map showing the location and production of book printers in Kolkata, July-December 1867

Now digitised for the first time, the Quarterly Lists have been processed with optical character recognition to create ALTO XML for every page, a format designed to represent the content layout accurately. This enables researchers to apply computational tools and methods across all 100,000 pages of the lists to answer their questions about book history. Researchers are able to examine a rich seam of bibliographic data about books published throughout India, including the name and address of printers and publishers, the price of each publication and how many copies were printed. So if a researcher is interested in what the history of book publishing reveals about a particular time period and place, the full XML OCR and searchable PDF dataset can be accessed from data.bl.uk/twocenturies-quarterlylists/.
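For anyone wondering what working with ALTO XML might look like in practice, here is a minimal Python sketch that reads one page and lists its text lines along with their position on the page. It relies only on element and attribute names from the ALTO standard (TextLine, String, CONTENT, HPOS, VPOS). The sample page is hand-written for illustration and is not taken from the dataset, and the function names are my own. The files in the dataset may use a different ALTO version or namespace, so the sketch ignores namespaces altogether.

```python
import xml.etree.ElementTree as ET

# Hand-written ALTO-style sample for illustration only (not from the dataset).
SAMPLE_ALTO = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#">
  <Layout>
    <Page ID="P1">
      <PrintSpace>
        <TextBlock ID="B1">
          <TextLine ID="TL1" HPOS="100" VPOS="200" WIDTH="900" HEIGHT="40">
            <String CONTENT="Calcutta" HPOS="100" VPOS="200" WIDTH="300" HEIGHT="40"/>
            <SP/>
            <String CONTENT="Gazette" HPOS="420" VPOS="200" WIDTH="280" HEIGHT="40"/>
          </TextLine>
          <TextLine ID="TL2" HPOS="100" VPOS="260" WIDTH="900" HEIGHT="40">
            <String CONTENT="Price" HPOS="100" VPOS="260" WIDTH="180" HEIGHT="40"/>
            <SP/>
            <String CONTENT="2" HPOS="300" VPOS="260" WIDTH="30" HEIGHT="40"/>
            <SP/>
            <String CONTENT="annas" HPOS="350" VPOS="260" WIDTH="200" HEIGHT="40"/>
          </TextLine>
        </TextBlock>
      </PrintSpace>
    </Page>
  </Layout>
</alto>"""


def local_name(tag):
    """Drop the '{namespace}' prefix so the code works across ALTO versions."""
    return tag.rsplit("}", 1)[-1]


def extract_lines(alto_xml):
    """Return one dict per TextLine: its joined words and its page position."""
    root = ET.fromstring(alto_xml)
    lines = []
    for element in root.iter():
        if local_name(element.tag) != "TextLine":
            continue
        # Each recognised word is a String element; the text is in CONTENT.
        words = [
            child.get("CONTENT", "")
            for child in element
            if local_name(child.tag) == "String"
        ]
        lines.append(
            {
                "text": " ".join(words),
                "hpos": float(element.get("HPOS", 0)),
                "vpos": float(element.get("VPOS", 0)),
            }
        )
    return lines


for line in extract_lines(SAMPLE_ALTO):
    print(line["vpos"], line["hpos"], line["text"])
```

Because every line keeps its horizontal and vertical position, a researcher could go on to group lines into the columns of the printed tables, though how well that works will depend on the quality of the OCR for a given page.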

Through the Digital Research strand of the project we will be seeking out innovative research groups willing to take a crack at reducing the character error rate and improving the accuracy of tabular text recognition and extraction from the Quarterly Lists. With that in mind, we have launched a competition through the University of Salford’s PRIMA Research Lab, as part of the International Conference on Document Analysis and Recognition, taking place in Kyoto, Japan in November 2017. The competition seeks an accurate and automated transcription solution for the Bengali books as well as the Quarterly Lists. So if you or anyone you know would like to enter, do please register: you could be contributing to this landmark project, and picking up an award for your troubles!
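For readers unfamiliar with the measure, character error rate is commonly calculated as the number of single-character insertions, deletions and substitutions needed to turn the OCR output into a correct transcription, divided by the length of the correct transcription. The Python sketch below shows that common definition. It is not necessarily the evaluation method used in the competition, and the function names and sample strings are my own.

```python
def levenshtein(reference, hypothesis):
    """Minimum number of single-character edits turning hypothesis into reference."""
    # previous[j] holds the distance between the reference prefix processed
    # so far and the first j characters of the hypothesis.
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (ref_char != hyp_char)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


def character_error_rate(reference, hypothesis):
    """Edit distance divided by the length of the correct transcription."""
    if not reference:
        raise ValueError("reference transcription must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


# Made-up example: the OCR has read the letter 'l' as the digit '1'.
print(character_error_rate("Calcutta Gazette", "Ca1cutta Gazette"))
```

A lower figure is better: a rate of 0 means the OCR output matches the transcription exactly.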


If you are interested in using the Quarterly Lists in your research or simply want to find out more about them, feel free to drop me an email at Tom.Derrick@bl.uk, or follow the project @BL_IndianPrint.
  • Tom Derrick
  • Tom is a member of the Two Centuries of Indian Print project that will digitise and make available rare Indian books held at the library. He is interested in the creation of datasets through exploring OCR technologies and supporting researchers applying digital research methods to the collection.

    Prior to joining the Library Tom gained several years of experience across editorial development, production and marketing functions for an academic publisher of digital learning resources, collaborating with an international array of libraries to digitise their collections. 

The text in this article is available under the Creative Commons License.