Google’s Plan to Scan EVERY Book

Posted on: April 5, 2009

Just came across a really interesting article about something I previously knew nothing about. Maybe it is because I am not really interested in books, or maybe it's just not widely known, but Google is scanning every book to create an online archive of digital books: a digital library. Obviously, with the creation of the Reader Digital Book from Sony, books are already becoming digitised, but Google will stop at nothing until every book is available in digital form.



However, there is a hitch in Google’s plan to digitise the world’s books and make them searchable online: scanning them is taking too long.

That’s because character recognition software needs a neat, flat 2D image of the text. But book bindings cause pages to arch up on either side of the spine – bending the text and making it hard to interpret.

Last week, though, Google was granted a patent (US 7508978) on an answer to this problem. Its trick is to project an infrared pattern onto the open page spread. This lets a pair of infrared cameras map the three-dimensional shape of the pages by detecting distortion in the pattern. This in turn allows the distortion of the text to be determined – and therefore the degree of correction needed to read it accurately.
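
To get a feel for the "degree of correction" part, here is a rough sketch in Python. It is not the method from the patent, just a toy version under some big assumptions: the page only curves in one direction (towards the spine), the camera looks straight down, and the 3D mapping step has already given us the page height at evenly spaced columns. The function names and the `heights` list are made up for illustration. The idea is that the true distance along the paper is the length of the curved surface, so text on a steep part of the page looks squashed and needs stretching back out.

```python
import math


def flattened_positions(heights, dx=1.0):
    """Map each sampled column to its position on the flattened page.

    heights: page height above the scanner bed at evenly spaced columns
             (assumed to come from the 3D mapping step; hypothetical input).
    dx:      horizontal spacing between samples, in the same units.
    """
    positions = [0.0]
    for left, right in zip(heights, heights[1:]):
        # Length of the page surface between two neighbouring samples.
        positions.append(positions[-1] + math.hypot(dx, right - left))
    return positions


def local_stretch(heights, dx=1.0):
    """Factor by which the image must be widened in each interval
    to undo the squashing caused by the page sloping away."""
    return [math.hypot(dx, b - a) / dx for a, b in zip(heights, heights[1:])]


if __name__ == "__main__":
    # Made-up profile: flat near the page edge, rising towards the spine.
    heights = [0.0, 0.0, 3.0, 7.0]
    print(flattened_positions(heights, dx=4.0))
    print(local_stretch(heights, dx=4.0))
```

Where the page is flat the stretch factor is 1 (nothing to fix), and it grows as the page gets steeper near the spine. A real system would have to deal with curvature in both directions and with the camera's perspective too.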


