This pilot project will digitise rare and unique printed books from the British Library's South Asian printed books collection and enhance the catalogue records to automate searching and aid discovery by researchers.
At the end of 2015, an international partnership led by the British Library received funding from the Newton Fund to digitise rare material from its South Asian printed books collection. The Two Centuries of Indian Print project has digitised early printed Bengali books and in early 2018 received further funding to digitise books printed in Assamese, Sylheti and Urdu languages.
Highlights from the digitised collection are available through our online exhibition Early Indian Printed Books. To view all books digitised by the project, visit the British Library catalogue and select 'I want this', then 'view digital content online'.
The project will also explore how digital research methods and tools can be applied to this digitised collection, and will deliver digital skills workshops and training sessions at Indian institutions to support innovative research within South Asian studies.
This project is a partnership between the British Library, the School of Cultural Texts and Records (SCTR) of Jadavpur University, Srishti Institute of Art, Design and Technology, and the Library at SOAS University of London, working with the National Library of India, the National Mission on Libraries, and other institutions in India.
We hope to extend the project to other languages in further phases as further funding becomes available.
For media enquiries, please contact Ben Sanderson: email@example.com.
For the first time the project has made freely available in digital format the library's collection of bound Quarterly Lists. These are descriptive catalogue records of books published quarterly and by province of British India between 1867 and 1947. The Quarterly Lists are available to download as searchable PDFs and as OCR XML via the British Library's datasets portal, data.bl.uk.
In 2017 we ran a competition to find an optimal solution for automatically transcribing the Bengali books and Quarterly Lists. Accurate transcriptions would enable researchers to search the full text of this valuable content. If you or anyone you know would like to use the competition dataset to try OCR for this British Library material, it is freely available to download from the competition website. We would love to hear how you have found working with the dataset, or whether you would like to try OCR for our Assamese, Sylheti and Urdu books.
Events & Outreach
Workshop on Islam and Print in South Asia
Researchers from India, Bangladesh, Pakistan, Europe and America gathered for a series of two workshops, which took place at the British Library on 28 September and 26 October 2018. You can view the programme and abstracts of talks.
Between November 2016 and December 2018 the British Library hosted the ‘South Asia Series’, a series of talks inspired by the ‘Two Centuries of Indian Print’ project and the BL South Asia collection. The series featured academics and researchers from the UK, who shared their cutting-edge research. More talks are being organised for spring/summer 2019.
In July 2017, we held a two-day academic symposium on South Asian Book History at Jadavpur University, which brought together researchers, scholars and Digital Humanities practitioners from the UK, India, Bangladesh and Nepal. 25 speakers across 7 panel sessions discussed cutting-edge research in the field. View abstracts from the panel sessions. Videos of the talks can be watched at bl.uk/early-indian-printed-books/vidoes
Digital Skills Workshops
In October 2017 our Digital Curator led a training event at the International Conference of Asian Libraries, held at Jamia Millia Islamia University, New Delhi. The event introduced librarians from all over India to the digitisation standards practised by the British Library. A panel of speakers from the Centre for Studies in Social Sciences, Kolkata, and the India International Centre also shared digitisation work undertaken at their institutions.
The same workshop was held again in July 2018 at the India International Centre in collaboration with the American Institute for Indian Studies and Ashoka University. The event was attended by archivists representing academic and cultural institutions from across India, as well as from Cambodia and Australia.
In July 2017 we hosted the second of three skills-sharing workshops at Jadavpur University, Kolkata. The event, 'Developments with Optical Character Recognition for Bangla', addressed the challenges and opportunities of OCR and computational linguistics in opening up vast quantities of knowledge to digital researchers. Attendees from 10 different institutions, with backgrounds in information science, academia and computer science, experimented with a range of state-of-the-art OCR tools for Bangla, including the open source Tesseract OCR. You can view a guide on how to install and use the latest version of Tesseract to obtain OCR for your own materials.
Participants in the 'Developments with Optical Character Recognition for Bangla' workshop experiment with different OCR tools
In December 2016, the first workshop took place at Jadavpur University, Kolkata, where library and information professionals from cultural heritage institutions in Bengal took part in a one-day event to learn more about how information technology is transforming humanities research today and, in turn, library services. View the agenda for the workshop.
You can support this project and help make the Indian Print Collection freely available online to all.