A Palimpsest of Code

I don’t know enough to assert one way or another whether the Google Books ruling is ultimately a “good” or a “bad” decision. What I do know is that it is fascinating.

US District Judge Denny Chin’s decision is, to my mind, far more interesting than a legal ruling has any right to be. I say this because at the core of the legal decision is a mind-twisting idea:

The display of snippets of text for search is similar to the display of thumbnail images of photographs for search or small images of concert posters for reference to past events, as the snippets help users locate books and determine whether they may be of interest. Google Books thus uses words for a different purpose — it uses snippets of text to act as pointers directing users to a broad selection of books.

Similarly, Google Books is also transformative in the sense that it has transformed book text into data for purposes of substantive research, including data mining and text mining in new areas, thereby opening up new fields of research. Words in books are being used in a way they have not been used before. Google Books has created something new in the use of book text: the frequency of words and trends in their usage provide substantive information.

Google Books does not supersede or supplant books because it is not a tool to be used to read books. Instead, it “adds value to the original” and allows for “the creation of new information, new aesthetics, new insights and understandings.” Hence, the use is transformative.

Think about that: turning text, already one form of “data”, into another form of data is “highly transformative”. It’s a translation of sorts, but if translation is a kind of chemical reaction, then the source material here is both reactant and catalyst, the thing being changed and the thing left untouched.

It’s as if on a printed page you have letters and words functioning one way, and then underneath it, as a kind of trace, is an entirely separate code system that contains meaning temporarily invisible to the reader. It’s a palimpsest of code!

Now, here’s what I’m wondering: What are the narrative possibilities of a text that is always operating in two, mutually incompatible but still comprehensible languages — as if a printed page or a section of text always has hovering just above or below a holographic page of code? What might a book look like or do if it put itself forth in two distinct ways, as always both prose and programming language, narrative and the database?

In a sense, we have something like an analogous precedent. You could, if you wanted to, read Ulysses utterly unaware of the sprawling network of references, focusing on the ostensible narrative. “Underneath” (or perhaps more accurately, alongside) is another sign system of meaning, working its way “invisibly” through the text. Diving into that set of code opens up and elucidates the text in no end of rich, meaningful ways, situating the novel in both its aesthetic and ideological context.

Can we do that with code? What I do not mean is an executable hidden in the margins, opening up a game or a movie about the book. Instead I’m talking about bits of meaning, marked out in inconspicuous ways that only reveal themselves if the interpreter approaches them in the right “language” — patterns, repetitions, cadences, or rhythms, only readable by the machine but full of human possibility.

Imagine a novel about a musician that reveals its off-kilter time signature through measured instances of the word “beat”, or a text about the immigrant experience that ran unseen contradictory interpretations of key moments backwards through a narrative.
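To make the “beat” idea a little more concrete, here is a minimal sketch of how a machine reader might pull a rhythm out of a passage, assuming the pattern lives in the number of words between successive uses of a chosen word. The function name `word_gaps` and the sample sentence are hypothetical, invented purely for illustration; a real text could hide its pattern in any number of other ways.

```python
import re

def word_gaps(text: str, word: str) -> list[int]:
    """Return the distance, in words, between successive occurrences of `word`.

    Hypothetical helper: it assumes the hidden "time signature" is encoded
    in how many words separate each use of the target word.
    """
    # Lowercase the text and split it into simple word tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    # Record the position of every occurrence of the target word.
    positions = [i for i, tok in enumerate(tokens) if tok == word.lower()]
    # The gaps between neighbouring positions are the "measure lengths".
    return [later - earlier for earlier, later in zip(positions, positions[1:])]

# An invented sample passage, not taken from any real novel.
sample = (
    "The beat came in late. He counted one, two, then the beat again, "
    "and after a long held breath the beat returned."
)
print(word_gaps(sample, "beat"))
```

A human reader would pass straight over the uneven spacing; a script like this sees only the spacing, which is roughly the split between the two “languages” I have in mind.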

Have we been going at the connections between literature and code all wrong? Should we instead be focusing on an interrelationship between the two that is as constitutive as it is invisible, unreadable?

11 comments

Like, you could choose a particular data visualization of the text – there are some good ones about – and write a story together with the output of the viz? So it made both a pretty / impactful / melancholy picture, and a good story.

Kerry says…

Nice post -

The most fascinating bit to me – G Books is fair use because ‘it is not a tool to be used to read books’

So who’s reading it then? I couldn’t help but read the decision as, ‘Google Books is just training material for AI, this isn’t a human concern.’

So in addition to Books being a new way for us to read, in a pattern-based way – it’s also a new way to write. Authors always have an idea of their audience in mind, but it’s now pertinent to wonder, “What will Google think?”

I’m curious about the idea of writing “for” Google Books. If Google Books does develop into a foundation for AI, or for a new way to catalogue the world’s knowledge, could an author decide that there is a new idea that they feel needs to be added to the collection? Maybe 99% of the knowledge needed is contained in previous texts, but they have an extra 1% that can make it more relevant?

Maybe Google Books is a play towards Wikipedia. Instead of asking everyone to contribute to one centralized service, there is a collection process from an infinite number of texts.

Delayed response, but yeah – this is the double-edged sword of this, isn’t it? If I’m (very vaguely) proposing machine-readable patterns in texts, then which “reading algorithms” get prioritized?

It’s a tricky question, because you could ask the same question about interpretive lenses or cultural frameworks – if I write this novel in a way so that, for example, British-Chinese immigrants will get most out of the references, am I missing something/excluding people etc etc.

It reminds me of the early days of writing for the web, when Suck.com was playing with the idea of links being less literal. Instead of just using links to define terms or link to relevant sites, they could be used to add color or humor to a story.

What is amazing within the Google decision is that the judge is making the case for books themselves being a creative work that is built off of the building blocks of ideas, words, and data. The work that Google is doing is breaking the mortar that is binding those bricks, allowing themselves (and ultimately others) to build something fresh with them.

Writers have been doing that work manually since the beginning of time. But it is interesting to see computers finally get their go at it

“Writers have been doing that work manually since the beginning of time. But it is interesting to see computers finally get their go at it”

Oh, that’s fascinating – that it’s the rearrangement of data “against” the author’s intentions that the computer will be best at.

I guess the question then becomes what are the units of meaning the computer rearranges? Is it randomized, or is it sections of the text or… what happens if/when computers can make some sense of narrative?

Ahhh this is so good:

Imagine a novel about a musician that reveals its off-kilter time signature through measured instances of the word “beat”, or a text about the immigrant experience that ran unseen contradictory interpretations of key moments backwards through a narrative.

SO GOOD.

This is one of those moments when I wish I had something more substantive to add — something more than enthusiasm & yes-yes-more-more-ism, which I will attempt to communicate with the following series of exclamation marks: !!!!!!!!!!!!!!!!!! ! !!!!!!!

Those are some excellent ideas, and there is a long tradition behind them, not just in the Joycean mode, either. There was a whole cottage industry of literature in the 18th and 19th Century of works written with goals like that in mind (most of it unpublished or published but ultimately unreadable). Book-length palindromes, novels structured like trees, etc. They generally proved as difficult to read and enjoy (beyond the little tickle one gets intellectually from seeing such a thing) as they did to write.

I dream of expanding experimentation in ways like this (patterns specifically for machines!), but the older I get and the more I read the more it becomes clear to me that there are good reasons, for readers and writers both, that variations of Victorian realism continue to dominate English-language long-form prose.

Pro tip: in communicating exuberance, !!1!111!!!11 >> !!!!!!!!

A few thoughts, in random-access order:

1) Read Whitney Trettien’s “A Deep History of Electronic Textuality: The Case of English Reprints Jhon Milton Areopagitica.”

2) I was also struck by the Google Books’ decision’s language, and maybe especially in the sections Nav singles out. I mean, “words for a different purpose”! It’s mesmerizing.

3) The fundamental insight that made digital computing possible was the realization that it was all a matter of manipulating symbols to get new symbols, that numbers could be commands as easily as they could be data, and that there was fundamentally no inherent distinction between the two. Similarly, there are words and there are words*, words that mean and words that point, words for men and words for their machines.

4) And don’t we already know this? Isn’t the blurring of these two already apparent, whenever we insert a bit of markup or unix code into the body text and we malform a tag and the site executes a command we didn’t intend? (I made all of kottke.org blink once.)

5) We know that there are words that DO things. We write books about it, like J. L. Austin’s How to Do Things With Words. We have street signs. We have SEO. You can buy keywords, and some keywords, like some street addresses, are very valuable indeed.

6) We know that words can be made to generate words, words point to words, that they can be transformed and mutated and reassembled. We know how to use hashtags and HTML markup in every register, ironic and serious.

7) We know that the only thing that is inevitable is literacy, and that we are on our second if not our third dose of it.

8) We know that Google doesn’t understand the words they have, and doesn’t much seem to care.

And what does THAT mean, ultimately — that the company that’s mined this mighty deposit of text does not seem to care about its flaws, is willing to treat them as a rounding error — that Google does not mind that there are bugs in its code?

Yes to… all of that! But your last point – I think this is why it bugs me that it’s Google, and not a public (in whatever form) group/entity that is in charge of something like this. I mean, it doesn’t have to be exclusive, but right now, Google is not only the sole company doing this, it’s also probably one of the few entities* anywhere that has the capability. So I wish that a library or less-bad Gutenberg Project-esque thing would take up this mantle of digitizing books, now that we can (tentatively) call it fair use.

*am I alone in that the word entity always makes me think of the crystalline entity from ST:TNG?
