sfasffwesdc: Google's Digital Library of Alexandria

Sunday, August 13, 2006

Google's Digital Library of Alexandria

Google's founders have wanted to digitize library books since they were students at Stanford. Finding and reading books by typing a few keywords into a program sounds great for a library, but digitizing all the books in the world runs into practical obstacles: many books are copyrighted and are still sold in bookstores. Google started scanning public domain and out-of-print books in 2004 and wants to continue the process with the rest of the books. Google also has partnerships with some publishers and with universities like Stanford.

"When Google announced the library scanning project, in December 2004, it had four library partners besides Stanford. Two of them (Oxford University and the New York Public Library) took a legally cautious approach to digitization, permitting Google to copy only public domain works. A third, the University of Michigan, took the opposite view, asserting forcefully that Google could scan every one of its 7 million books. Harvard hedged its bets, initially agreeing only to a limited test program. Last week, the University of California signed on as a sixth Google partner. Its scanning program will include both public domain and copyrighted material," reports The Washington Post.

Last year, the Authors Guild and major publishing houses like McGraw-Hill and Penguin Group sued Google for scanning books without permission. Google says its digitizing process doesn't infringe copyright, because it is a transformative use covered by fair use. Google also compares scanning and indexing books with crawling and indexing web pages: in both cases it has to store a cached copy of the content, transform it into an index of keywords and make it searchable. Webmasters and publishers who don't want their site or book in the index can ask to be excluded. The difference is that web pages are mostly available for free, while books must be bought. Google Book Search shows only a small number of pages from a book and doesn't allow copying the book's content. "Copyrighted books are indexed to create an electronic card catalog and only small portions of the books are shown unless the content owner gives permission to show more," says Google.
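To make the comparison concrete, here is a minimal Python sketch of the steps described above: keep a copy of the text, turn it into a keyword index, skip books whose publishers opted out, and show only a short snippet in the results. This is not Google's actual implementation. Every name in it (`build_index`, `search`, `SNIPPET_WORDS`, the sample books) is hypothetical and chosen only for illustration.

```python
import re
from collections import defaultdict

SNIPPET_WORDS = 8  # hypothetical cap on how much text a result may reveal


def tokenize(text):
    """Lowercase the text and split it into alphanumeric keywords."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(books, excluded=()):
    """Map each keyword to the set of (book_id, page_number) where it occurs.

    `books` is {book_id: [page_text, ...]}. Ids listed in `excluded` are
    skipped, which stands in for a publisher's opt-out request.
    """
    index = defaultdict(set)
    for book_id, pages in books.items():
        if book_id in excluded:
            continue
        for page_number, page_text in enumerate(pages, start=1):
            for word in tokenize(page_text):
                index[word].add((book_id, page_number))
    return index


def search(index, books, query):
    """Return (book_id, page_number, snippet) for pages with every query word."""
    words = tokenize(query)
    if not words:
        return []
    hits = set.intersection(*(index.get(w, set()) for w in words))
    results = []
    for book_id, page_number in sorted(hits):
        page_words = books[book_id][page_number - 1].split()
        results.append((book_id, page_number, " ".join(page_words[:SNIPPET_WORDS])))
    return results


# Made-up sample data: one indexed book and one opted-out book.
books = {
    "moby": ["Call me Ishmael. Some years ago never mind how long precisely",
             "whale ship sea"],
    "private": ["secret whale text"],
}
index = build_index(books, excluded={"private"})
print(search(index, books, "whale"))  # [('moby', 2, 'whale ship sea')]
```

The full text is stored in `books`, but a search only ever returns the first few words of a matching page, which is roughly the "card catalog plus small portions" idea from Google's statement.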

What publishers fail to understand is that a book search engine will increase their sales, as people will discover books they wouldn't have found otherwise. The vast collection of human knowledge would be available to anyone interested. The quality of the content is also better than the thin information often found on the web. The book search engine could also morph into a digital library that lets you read, download and print books for a price. Publishers are afraid that Google will undermine their power and take advantage of their content for free, but webmasters had the same fears when Google started to crawl the web and slow down their servers.

You can read more about Google Book Search in The Washington Post's article "Google Wants to Digitize Every Book. Publishers Say Read the Fine Print First."
