For folks interested in what’s happening with digital books, I think Anthony Grafton’s 2007 New Yorker essay on book digitization is foundational reading at this point. Grafton is a historian of Renaissance humanism and early print culture; he writes with a great deal of sympathy even as he criticizes a lot of the ramshackle moves that have been made in getting print books up on the web.
It’s a weird thing – I think I can say that the age of digital humanism we’re in shows the same enthusiasm as the Renaissance for getting old texts into circulation and generating new information, but much less care than the early humanists took in making sure that the information is complete, accurate, or discriminating. And it seems as though this is what traditionalists and futurists argue about, endlessly.
This, at least, is the tension in Peter Green’s TLS review of Grafton’s new book, Worlds Made by Words, which contains an expanded version of that New Yorker essay, plus plenty of tasty goodness about Renaissance humanists like Leon Battista Alberti, or Justus Lipsius, a Flemish philologist who “offered to recite the text of Tacitus with a knife held to his throat, to be plunged in if he made a mistake.” Green’s review is titled “Google Books or Great Books,” and it offers a nice peek into what Grafton’s all about. Here’s a slice of the good:
An editor at Cambridge University Press, reputedly the world’s oldest publisher, cheerfully admitted to Grafton that, conservatively, “95 percent of all scholarly enquiries start at Google”. Which, as Grafton says, “makes sense: Google, the nerdiest of corporations, has roots in the world of books”, to the point where (if you throw in Amazon and one or two others) “the Web has become a vast and vivid online bookstore”… Today all would-be members of the Republic of Letters, all hopeful explorers of past history, have, in a literal sense, the world at their fingertips. As Grafton says, “it is more than transformative to sit in your office at a small liberal arts or community college and call up, as you already can, thousands of books in dozens of languages, the nearest material copy of which is hundreds of miles away”.
And the bad:
Scanning by optical character recognition, ironically, commits some of the same errors as those made by careless medieval scribes, including long “s” read as “f” (German scholarship sometimes appears as Wiffenschaft), and the confusion of u and n. Thus, key in the meaningless qnalitas for qualitas (a key term in medieval philosophy) and you get over 600 hits for qualitas which you would miss if you only keyed in the correct word. Much of the old German spiky Gothic black-letter material (Fraktur) comes out in “plain text” as gobbledegook.
Which Grafton synthesizes in a really lovely way, as follows:
Yes, the young scholar is told, take every advantage of the new electronic Aladdin’s cave. But – and here Grafton shows a rare moment of deeply felt emotion – these streams of data, rich as they are, will illuminate rather than eliminate the unique books and prints and manuscripts that only the library can put in front of you. For now, and for the foreseeable future, if you want to piece together the richest possible mosaic of documents and texts and images, you will have to do it in those crowded public rooms where sunlight gleams on varnished tables, as it has for more than a century, and knowledge is still embodied in millions of dusty, crumbling, smelly, irreplaceable manuscripts and books.
Here’s a thought. One term that I think we can use to bring the digital enthusiasts and the traditional scholars together – besides humanism, which I still think is a super-powerful idea – is standards. One thing web and software people are actually surprisingly good at, considering the libertarian ethos that drives a lot of the best work, is establishing standards of mutual interoperability.
What if scholarly bodies like the Modern Language Association, American Library Association, American Historical Association, etc., worked together with the tech guys to establish standards for digital scholarly texts in their fields? Work to verify the scans, establish the bibliographies (it would really help to know, for example, if a full-preview book in Google Books is actually from a pirated or faulty edition), and verify the results? Hashtags for scanned books!
I think that could be beautiful.
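
For the tinkerers: here is a rough Python sketch of what one small piece of "verifying the scans" might look like. It takes a search term and expands it into the misreadings Green describes (long "s" read as "f", and u confused with n), so you can see which garbled spellings to look for. The confusion table and the function name `ocr_variants` are my own assumptions for illustration, built only from the two examples in the review; real OCR errors are messier than this.

```python
from itertools import product

# Assumed confusion pairs, based only on the two examples in Green's review:
# long "s" read as "f", and "u" confused with "n" (in either direction).
# Each value lists the original letter first, then its possible misreading.
CONFUSIONS = {"s": "sf", "u": "un", "n": "nu"}


def ocr_variants(word, max_variants=1000):
    """Return the word plus spellings a careless OCR pass might produce."""
    word = word.lower()
    choices = []
    for i, ch in enumerate(word):
        if ch == "s" and i == len(word) - 1:
            # The long s was not normally used at the end of a word,
            # so a final "s" is left alone.
            choices.append("s")
        else:
            choices.append(CONFUSIONS.get(ch, ch))

    variants = set()
    # product() yields the unmodified word first, so it is always included
    # even if the cap on variants is reached.
    for combo in product(*choices):
        variants.add("".join(combo))
        if len(variants) >= max_variants:
            break
    return sorted(variants)


if __name__ == "__main__":
    print(ocr_variants("qualitas"))  # ['qnalitas', 'qualitas']
```

Run it on "Wissenschaft" and "wiffenschaft" turns up among the candidates. A shared, agreed-upon table of confusions like this is exactly the sort of thing a standards effort could maintain, rather than every scholar guessing at it alone.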