Observations: Google Book Search

by Jim Lyons

The Hard Copy Observer, July 2008

Google Book Search

Google (NASDAQ GOOG) Book Search (GBS for our purposes) has been on my radar for the last few years. The search giant’s “humble” goal to digitize millions of books appealed to me right off, as it just seemed to have that ring of over-achieving for which Google has become famous. While I’ve blogged about GBS a few times briefly in the past, I have included the following disclaimer in my print-industry blog, that GBS “...sits outside my general area of interest of printers, but hey, they use scanners to implement it, right?”

But now that one of our printing industry celebrities has made a bit of a stand on the importance of GBS, it’s time for me to shed the disclaimers and open up the topic for regular blog and column fare. It was only recently (May 2008) that Vyomesh Joshi, executive vice president of HP’s (NYSE HPQ) Imaging and Printing Group, invoked the significance of GBS while speaking to the press at the quadrennial printing fair drupa in Düsseldorf, Germany (see the July issue of The Hard Copy Observer for full coverage of drupa). Though I wasn’t in attendance for his speech, I eagerly read the Reuters account and immediately welcomed the entrance of GBS into the relevant world of printing. According to Georgina Prodhan of Reuters, Joshi predicted, “After the digitalization of music and photos, which are already well advanced, labels, marketing materials, and books will follow in the next few years.” Joshi referred to Google’s ambitious project to scan all the world’s books that are out of copyright, and said that the firm has already gathered digital versions of more than one million books. “If you want to get a book that’s out of print, digital is the way to do it, because you only print one copy,” said Joshi.

Now to my assessment of how and why GBS might be important to the printing industry, and whether or not I agree with Joshi, but first a little history. So many trends and developments seem to bubble up and merge into one’s consciousness, formed from bits and scraps without any clear origin. In this case, though, I can distinctly pinpoint the source of what got me thinking about and observing developments related to GBS.

GBS was originally dubbed “Google Print” at its unveiling in November 2004, and it was Michael J. Miller, in PC Magazine dated December 27, 2005, who teased me with a piece titled “Why Google Print is more important than you think.” With such a title, I felt compelled to read the article, and shortly after, a professional researcher and writer whom I greatly respect confirmed his agreement that GBS was definitely something to keep an eye on.

Miller’s article reviewed a bit of history on Google’s approach to capturing out-of-copyright as well as in-copyright works and discussed a bit of the controversy regarding the latter, with its opt-out feature for publishers not wanting their works scanned by Google. He explained that this capture function constituted a different act altogether from the copyright-violating intent of a Napster, for example, whose memory, if not the actual company, was still front-and-center in 2005. Miller defended Google, urged publishers to see the light, and argued that a Google search leading to a snippet of captured text from a book could actually lead Web searchers to buy the books in question.

Fast forward to 2008, and some of the controversy remains. Competing efforts have arisen, including Microsoft’s Live Book Search (since shuttered) and the Open Content Alliance, which has participation from Yahoo and Microsoft and takes an opt-in, as opposed to opt-out, approach for copyright holders. Hmmm, Google against Microsoft and Yahoo, why does this sound familiar?

GBS continues to scan copyrighted as well as out-of-copyright books, and though the company doesn’t disclose the data, outsiders estimate a scan rate of more than 3,000 books per day. With Joshi’s quoted cumulative total of more than one million scanned books now more than a year old, the new total, if those rates still apply, would put the online library at more than two million books.
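The back-of-the-envelope estimate above can be sketched in a few lines of Python. This is only an illustration: the function name and the assumption of a constant rate on all 365 days of the year are mine, and because both inputs are “more than” figures, the result is best read as a rough floor rather than a real count.

```python
def estimate_total(baseline_books, books_per_day, days_elapsed):
    """Project a cumulative scan count from a baseline and a constant daily rate."""
    return baseline_books + books_per_day * days_elapsed


# Figures taken from the column; both are "more than" values, so treat them as floors.
BASELINE = 1_000_000  # Joshi's quoted cumulative total
RATE = 3_000          # outsiders' estimated books scanned per day
DAYS = 365            # assumption: one full year of scanning, every day

estimate = estimate_total(BASELINE, RATE, DAYS)
print(f"Estimated library size: {estimate:,} books")
```

Under those assumptions the projection lands just over two million, which matches the figure quoted above.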

The figure above shows a screenshot of a search for Gone With the Wind, a book that is still under copyright. GBS displays a few highlights or snippets and provides links to popular booksellers like Amazon and Barnes and Noble for would-be buyers. The “find this book in a library” link took me to my local library’s search function with just a few more clicks. Given the right mindset (as Miller suggested in 2005), this functionality could be seen as a major “win-win” for owners and potential readers of the copyrighted materials, in addition to a source for out-of-print publications.

But what about printing? Joshi would have us think this huge library of digitized content can lead to printing opportunities galore. It’s interesting, though, that he would raise this topic at drupa, where HP was touting its impressively expanding line of commercial printing devices. It seems to me that personalized printing, as it may someday be emphasized by GBS, could potentially take away from centralized, commercial printing, even the customized, shorter runs that digital presses offer, including but not limited to the one-off out-of-print copies he mentions.

For the opportunity presented by Scribd and its “YouTube for documents” model (see my May 2008 Observations and also the Observer, 5/08), existing desktop printers and simple stapling would probably suffice for converting most (shorter) documents to hardcopy. But when books are involved, even if not whole books but at least chapters or sections, a different kind of device seems called for. A desktop printer with high speed but very low cost per page (à la the promise of Memjet), combined with affordable and easy-to-use document-binding capability (either physically combined or at least on the same desktop), could be a winning personal book-printing solution. It would go along with a capture, store, search, and royalty-capture engine like a future version of GBS, provided, of course, the whole world hasn’t jumped to Amazon Kindles!
