ScanRobots for Mass Digitisation – even for Delicate Books
Mr Brantl, in July 2007 the Bavarian State Library started a mass digitisation project funded by the German Research Foundation (Deutsche Forschungsgemeinschaft). Within two years almost 37 000 printed works in German with a total of over 7.5 million pages from the period between 1518 and 1600 are to be digitised …
Well, our overall target for the next five to ten years is to make all copyright-free material in the Bavarian State Library freely available on the internet. That’s happening firstly through the cooperation with Google that we’ve started. Secondly it is happening within the context of independently-funded projects, for instance the one you mention.
What problems do books from the 16th century cause when you scan them?
We are talking about material that is very challenging from the point of view of preservation. 70 per cent of these old books can only be opened to an angle of 90 to 100 degrees. If you try to open them any further, the spine breaks. But as an archive library we are required to preserve our holdings forever, so to speak. This means that we can’t treat our works in this way, and repairing the spine of such a book costs around 2000 Euro.
What does that mean for the scanning process?
If I can’t open a book to an angle of 180 degrees, I can only scan one page at a time using the conventional book-scanning system. For this reason we looked around for an alternative scanning solution.
Now you’re using ScanRobots made by TreVentus, which have been developed for this project in a joint venture with your Digitisation Centre and the BSB Institut für Buch- und Handschriftenrestaurierung (Institute for Restoration of Books and Handwritten Documents).
Yes, with these the books are placed on a cradle, upon which they are opened very gently to an angle of just 60 degrees. The robot has a triangular scanning head, on the tip of which there is a prism. The scanning head moves into the book binding with this prism. It lifts both pages with suction created by a vacuum and photographs them as they lift. The advantage for us is that we can record two pages with one scan, thereby significantly increasing throughput of material that is difficult to preserve.
At the moment you are using three ScanRobots. Are there plans for more?
At the moment we have a problem with space. Right now there are 15 scanning systems on our premises, including the three robots, and with them we can scan original documents of sizes up to A0. We cannot accommodate more here at present.
How much does a machine like that cost?
Around 80 000 Euro – putting the ScanRobot in approximately the same price bracket as other large-format book-scanning systems.
How many pages per hour can a robot like that do?
You can’t give a generalised answer to that question. Of course, performance is dependent on the original material. In the 16th century for instance we have to deal with widely varying types of paper: from very soft to very stiff. The robot has trouble with very stiff paper. It then works more slowly or not at all. On the other hand if the paper is softer and can be lifted more efficiently by suction, so that it sits evenly on the scanning head, then we can achieve up to 900 pages per hour. For modern printed material from the 19th or 20th century this figure can be as much as 1 300 pages per hour.
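To put these figures in proportion, the following minimal sketch converts the peak rates into scanning hours for the project's 7.5 million pages. It is an illustration only: the function name `robot_hours` is hypothetical, and the calculation assumes that every page is scanned by a robot at the stated peak rate, which the interview makes clear is not the case for stiff paper or for books that cannot be handled by the robot at all.

```python
def robot_hours(total_pages: int, pages_per_hour: int, robots: int = 1) -> float:
    """Return the scanning hours needed if all robots run in parallel.

    Assumption (not from the interview): every robot sustains the given
    peak rate for the whole job, with no pauses and no rejected books.
    """
    if pages_per_hour <= 0 or robots <= 0:
        raise ValueError("pages_per_hour and robots must be positive")
    return total_pages / (pages_per_hour * robots)


TOTAL_PAGES = 7_500_000  # project total stated in the interview
ROBOTS = 3               # number of ScanRobots currently in use

for label, rate in (("soft 16th-century paper", 900),
                    ("19th/20th-century print", 1300)):
    hours = robot_hours(TOTAL_PAGES, rate, ROBOTS)
    print(f"{label}: about {hours:,.0f} hours with {ROBOTS} robots")
```

Because real throughput depends on the paper and binding of each volume, these values are best read as a lower bound on the scanning time rather than as a forecast.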
Can all old books be scanned with the robot?
No. Every 16th-century book is different, if only in how it is bound and in the texture of its paper. You need to pre-select the material. In this project we learnt that there are even books that can’t be opened at all.
In order to achieve your ambitious project target, mass digitisation at the Munich Digitisation Centre is done using a sophisticated work process …
Yes. In simple terms the process, which is also supported by our ZEND software that we developed in-house, looks like this: first the scan operators go through the stacks and establish which books are robot-compatible and which ones are not. The robot-compatible volumes go on to the librarians who put together the scan jobs and do the necessary cataloguing work.
After that the books go to the scanning centre. We have been running four shifts here since April 2008 – from 7 am until 11 pm. Incidentally the book covers are scanned in a separate process. The robots can’t do that yet. When the robot has finished scanning the book pages, this scan is joined together with the book cover.
Then everything goes to quality control. From there the images are collected and processed overnight using a server-side process, before they are transferred to the long-term archive after the final quality check. These processes all run automatically. While they are running, the files that are made available to web users are generated, among other things. These are JPEG files in two different resolutions and a PDF file with the entire contents of the book.
Finally someone hits the initialisation button. From then on you can view the book on the internet or download the PDF and read it on your e-book reader or on your computer.
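The sequence of steps described in this answer can be sketched as a simple ordered pipeline. This is a sketch under stated assumptions, not a description of how the ZEND software works: the names `Stage`, `route` and `next_stage` are hypothetical, and the interview does not say what happens to volumes that are not robot-compatible, so the sketch simply stops them after pre-selection.

```python
from enum import Enum
from typing import List, Optional


class Stage(Enum):
    """Workflow steps in the order given in the interview (names are hypothetical)."""
    PRESELECTION = 1          # scan operators check robot compatibility
    JOB_CREATION = 2          # librarians build scan jobs and catalogue
    ROBOT_SCAN = 3            # book pages are scanned by the robot
    COVER_MERGE = 4           # separately scanned cover is joined to the pages
    QUALITY_CONTROL = 5       # first quality check
    OVERNIGHT_PROCESSING = 6  # server-side processing, web files generated
    FINAL_CHECK = 7           # final quality check
    ARCHIVE = 8               # transfer to the long-term archive
    RELEASE = 9               # book becomes available on the internet


def route(robot_compatible: bool) -> List[Stage]:
    """Return the stages a volume passes through.

    Assumption: incompatible volumes leave this pipeline after pre-selection;
    the interview does not describe how they are handled.
    """
    if not robot_compatible:
        return [Stage.PRESELECTION]
    return list(Stage)


def next_stage(current: Stage) -> Optional[Stage]:
    """Return the stage that follows `current`, or None after the last one."""
    stages = list(Stage)
    position = stages.index(current)
    if position + 1 < len(stages):
        return stages[position + 1]
    return None


for stage in route(robot_compatible=True):
    print(stage.name)
```

Modelling the steps as a fixed order makes it easy to see where a volume is at any time, which matters when the aim is a steady flow of books to the robots.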
How long does the process you have just described take?
Normally the whole thing – from taking it off the shelf to putting it back there – lasts two to three weeks. Of course there are often several book titles bound together in a single volume. The important thing is to design the work process in such a way that the flow of books to the robots is consistent. After all, we are currently scanning around 300 titles per week in this project.
She works as a freelance publicist in Bonn
Translation: Jo Beckett
Copyright: Goethe-Institut e.V., Online-Redaktion