This pilot project will digitise more than 1,000 rare and unique Bengali printed books and enhance the catalogue records to automate searching and aid discovery by researchers.
At the end of 2015, an international partnership led by the British Library received funding from the Newton Fund to digitise rare material from its South Asian printed books collection. The Two Centuries of Indian Print project will digitise more than 1,000 early printed Bengali books, amounting to more than 400,000 pages. It will also explore how digital research methods and tools can be applied to this digitised collection, and will deliver digital skills workshops and training sessions at Indian institutions to support innovative research within South Asian studies.
This pilot project, to run from 2016 to 2018, is a partnership between the British Library, the School of Cultural Texts and Records (SCTR) of Jadavpur University, Srishti Institute of Art, Design and Technology, and the Library at SOAS University of London, working with the National Library of India, the National Mission on Libraries, and other institutions in India.
We hope to extend the project to other languages in later phases as more funding becomes available.
For media enquiries, please contact Ben Sanderson: firstname.lastname@example.org.
Visit our online exhibition, Early Indian Printed Books, to view highlights from the digitised books and read thematic articles on subjects including gender, food, and the development of the Indian railways.
All of the digitised books made available online through the project can be found by visiting the British Library catalogue and selecting 'I want this' and 'view digital content online'.
For the first time, the project has made the Library's collection of bound Quarterly Lists freely available in digital format. These are descriptive catalogue records of books, compiled quarterly and by province of British India between 1867 and 1947. The Quarterly Lists are available to download as searchable PDFs via the British Library's datasets portal, data.bl.uk.
We recently ran a competition to find an optimal solution for automatically transcribing the Bengali books and Quarterly Lists. Accurate transcriptions would enable researchers to search the full text of this valuable content. If you or anyone you know would like to try OCR on this British Library material, the competition data sets are freely available to download from the competition website.
Events & Outreach
Asian and African Studies at the British Library is pleased to announce the continuation through May 2018 of the ‘South Asia Series’, a series of talks inspired by the ‘Two Centuries of Indian Print’ project and the BL South Asia collection. We have a great line-up of academics and researchers from the UK and abroad in the coming months, who will share cutting-edge research, with discussion chaired by curators and specialists in the field.
View full abstracts for upcoming talks.
All seminars are held in the Foyle Suite, Conservation Centre, British Library, from 17:30 to 19:00.
Preeti Khosla (Independent)
Wednesday 30th May 2018
Digital Skills Workshops
In October 2017 our Digital Curator led a training event at the International Conference of Asian Libraries, held at Jamia Millia Islamia University, New Delhi. The event introduced librarians from all over India to the digitisation standards practised by the British Library. A panel of speakers from the Centre for Studies in Social Sciences, Kolkata, and the Indian International Centre also shared details of digitisation work undertaken at their institutions.
In July 2017 we hosted the second of three skills-sharing workshops at Jadavpur University, Kolkata. The event, 'Developments with Optical Character Recognition for Bangla', addressed the challenges and opportunities of OCR and computational linguistics in opening up vast quantities of knowledge to digital researchers. Attendees from 10 different institutions, with backgrounds in information science, academia and computer science, experimented with a range of state-of-the-art OCR tools for Bangla, including the open-source Tesseract OCR engine. You can view a guide on how to install and use the latest version of Tesseract to obtain OCR for your own materials.
Participants in the 'Developments with Optical Character Recognition for Bangla' workshop experiment with different OCR tools
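For readers who would like to try Tesseract on their own Bangla page images, the following is a minimal Python sketch of one possible approach, not the procedure from the workshop guide. It assumes Tesseract 4 or later is installed and on the PATH, together with the Bengali language data (language code `ben`). The function names, the `ocr_output` folder and the example file name are hypothetical and chosen only for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, output_base, lang="ben", psm=3):
    """Return the argument list for one Tesseract command-line call.

    lang="ben" selects the Bengali language data; psm=3 is Tesseract's
    fully automatic page segmentation mode.
    """
    return [
        "tesseract",
        str(image_path),
        str(output_base),  # Tesseract appends ".txt" to this base name
        "-l", lang,
        "--psm", str(psm),
    ]


def ocr_page(image_path, output_dir="ocr_output"):
    """Run Tesseract on a single page image and return the recognised text."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    out_dir = Path(output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    output_base = out_dir / Path(image_path).stem
    subprocess.run(build_tesseract_command(image_path, output_base), check=True)
    return Path(f"{output_base}.txt").read_text(encoding="utf-8")
```

With a scanned page saved as, say, `page_001.png`, calling `ocr_page("page_001.png")` would write `ocr_output/page_001.txt` and return its contents. Recognition quality on early printed Bangla will vary with the typeface and the condition of the scan, so the output is best treated as a starting point for correction.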
In December 2016, the first workshop took place at Jadavpur University, Kolkata, where library and information professionals from cultural heritage institutions in Bengal took part in a one-day event to learn more about how information technology is transforming humanities research today and, in turn, library services. View the agenda for the workshop.
You can support this project and help make the Indian Print Collection freely available online to all.