Monday, November 05, 2007

Grafton on Digitization

Biblio-historian Anthony Grafton has an essay on digitization in the New Yorker which is today's required reading. I think he's got it just about right:

"Google’s projects, together with rival initiatives by Microsoft and Amazon, have elicited millenarian prophecies about the possibilities of digitized knowledge and the end of the book as we know it. ... Others have evoked even more utopian prospects, such as a universal archive that will contain not only all books and articles but all documents anywhere—the basis for a total history of the human race.

"In fact, the Internet will not bring us a universal library, much less an encyclopedic record of human experience. None of the firms now engaged in digitization projects claim that it will create anything of the kind. The hype and rhetoric make it hard to grasp what Google and Microsoft and their partner libraries are actually doing. We have clearly reached a new point in the history of text production. On many fronts, traditional periodicals and books are making way for blogs and other electronic formats. But magazines and books still sell a lot of copies. The rush to digitize the written record is one of a number of critical moments in the long saga of our drive to accumulate, store, and retrieve information efficiently. It will result not in the infotopia that the prophets conjure up but in one in a long series of new information ecologies, all of them challenging, in which readers, writers, and producers of text have learned to survive."

Grafton provides a short history of 'information management', beginning with Mesopotamian tablet-cataloging and the Alexandrian scroll-copiers before turning to the changes wrought by print and by the technologies that have arrived since. He succinctly and accurately describes the shortcomings of all the current book digitization efforts - Google, Microsoft, Amazon, &c. - and also covers something that most who write on digitization leave out entirely: the place of non-Western books and collections in the grand plan. Grafton notes: "Sixty million Britons have a hundred and sixteen million public-library books at their disposal, while more than 1.1 billion Indians have only thirty-six million. Poverty, in other words, is embodied in lack of print as well as in lack of food. The Internet will do much to redress this imbalance, by providing Western books for non-Western readers. What it will do for non-Western books is less clear."

Grafton calls the idea of a universal archive "distant." I'd go a step further and call it utterly ludicrous. The numbers he cites make the point: "ArchivesUSA, a Web-based guide to American archives, lists five and a half thousand repositories and more than a hundred and sixty thousand collections of primary source material. The U.S. National Archives alone contain some nine billion items. It’s not likely that we’ll see the whole archives of the United States or any other developed nation online in the immediate future - much less those of poorer nations."

"The supposed universal library, then, will be not a seamless mass of books, easily linked and studied together, but a patchwork of interfaces and databases, some open to anyone with a computer and WiFi, others closed to those without access or money. The real challenge now is how to chart the tectonic plates of information that are crashing into one another and then to learn to navigate the new landscapes they are creating. Over time, as more of this material emerges from copyright protection, we’ll be able to learn things about our culture that we could never have known previously. Soon, the present will become overwhelmingly accessible, but a great deal of older material may never coalesce into a single database. Neither Google nor anyone else will fuse the proprietary databases of early books and the local systems created by individual archives into one accessible store of information. Though the distant past will be more available, in a technical sense, than ever before, once it is captured and preserved as a vast, disjointed mosaic it may recede ever more rapidly from our collective attention."

Grafton concludes on precisely the right note, in my admittedly biased view: "And yet we will still need our libraries and archives. ... Original documents reward us for taking the trouble to find them by telling us things that no image can. Duguid describes watching a fellow-historian systematically sniff two-hundred-and-fifty-year-old letters in an archive. By detecting the smell of vinegar - which had been sprinkled, in the eighteenth century, on letters from towns struck by cholera, in the hope of disinfecting them - he could trace the history of disease outbreaks. Historians of the book - a new and growing tribe - read books as scouts read trails. Bindings, usually custom-made in the early centuries of printing, can tell you who owned them and what level of society they belonged to. Marginal annotations, which abounded in the centuries when readers usually went through books with pen in hand, identify the often surprising messages that individuals have found as they read. ...

"For now and for the foreseeable future, any serious reader will have to know how to travel down two very different roads simultaneously. No one should avoid the broad, smooth, and open road that leads through the screen. But if you want to know what one of Coleridge’s annotated books or an early “Spider-Man” comic really looks and feels like, or if you just want to read one of those millions of books which are being digitized, you still have to do it the old way, and you will have to for decades to come. ... If you want deeper, more local knowledge, you will have to take the narrower path that leads between the lions and up the stairs. There - as in great libraries around the world - you’ll use all the new sources, the library’s and those it buys from others, all the time. ... [T]hese streams of data, rich as they are, will illuminate, rather than eliminate, books and prints and manuscripts that only the library can put in front of you. The narrow path still leads, as it must, to crowded public rooms where the sunlight gleams on varnished tables, and knowledge is embodied in millions of dusty, crumbling, smelly, irreplaceable documents and books."

Illuminate, rather than eliminate. I like it.