This post from Adrian Holovaty on Google Street View and driverless cars basically blew my mind. Keep in mind I just wrote a long-ish post about Google’s computer vision ambitions, so I knew some of this stuff already. Even so, the core insight that Adrian shares here is astonishing.
It’s harder than you might think to use Google Ngrams to actually chart trends in cultural history — or do “culturomics,” as the Science article authors would have it — because of well-known problems with the data set.
Here, Matthew Battles tries (on more or less a lark) to see some history play out, Bethany Nowviskie spots a trend (maybe true, maybe false), and Sarah Werner flags the problem.
Aw, man — that fhit Seriously Pucks.
You know what would actually be pretty cool, though? If it were easier to go one level deeper and use Ngrams to do Google Instant Regression. You could graph trends against well-known noise (other s-words misread as f) AND other trends — or instantly find similar graphs.
Let’s say the curve of the graph for the f–word in the 1860s is similar to that for other words and phrases — like “ass”* or “confederacy”* — you could correlate language with other language, individual words with stock phrases, and even (using language as an index/proxy) extralinguistic cultural trends or historical events.
Single-variable analysis just doesn’t tell you very much, especially on a data set as problematic as print/language. You need systematic data, and better comparison and control capacity between variables, before you can start to do real science.
(* Ignore for the purposes of this example ascribing contemporary historical meanings to these two ambiguous terms.)
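To make the "find similar graphs" idea a little more concrete, here is a minimal sketch of what comparing one word's curve against others might look like. It is not anything Google offers: the function names (`pearson`, `rank_similar`) are my own, and the numbers are made up for illustration, not pulled from the Ngram data set. It just scores each candidate curve by its Pearson correlation with the target, so a word that rises and falls in step with known OCR noise floats to the top.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(var_x * var_y)


def rank_similar(target, candidates):
    """Sort candidate series by how strongly they correlate with the target."""
    scored = [(name, pearson(target, series)) for name, series in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up relative frequencies for five consecutive decades (NOT real Ngram data).
target = [1.0, 2.0, 4.0, 3.0, 1.5]
candidates = {
    "ocr_noise_word": [1.1, 2.1, 3.9, 3.2, 1.4],  # moves with the target
    "unrelated_word": [5.0, 4.0, 3.0, 2.0, 1.0],  # steady decline
}
ranking = rank_similar(target, candidates)
for name, score in ranking:
    print(f"{name}: {score:+.2f}")
```

A high correlation with a known noise word would be a hint that the "trend" is an OCR artifact rather than cultural history. It would not prove anything by itself, which is the point about needing controls.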
Here are three disjoint thoughts, slightly too long for tweets/comments:
- Part of Lanier’s critique of Wikileaks works astonishingly well as a critique of Google’s Ngrams, too. (I’m working up a longer post on this.) In particular, I’m thinking of this observation:
A sufficiently copious flood of data creates an illusion of omniscience, and that illusion can make you stupid. Another way to put this is that a lot of information made available over the internet encourages players to think as if they had a God’s eye view, looking down on the whole system.
- I feel like we need a corollary to the Ad Hitlerem/Godwin’s Law fallacy. I’m going to call it “the Gandhi principle.” Just like trotting out the Hitler analogy for everything you disagree with shuts down a conversation by overkill, so do comparisons with Mahatma Gandhi, Martin Luther King, Nelson Mandela, Jesus, and other secular and not-so-secular activist saints.
We’ve canonized these guys, to the point where 1) we think they did everything themselves, 2) they never used different strategies, 3) they never made mistakes, and 4) disagreeing with them then or now violates a deep moral law.
More importantly, in comparison, every other kind of activism is destined to fall short. Lanier’s essay, like Malcolm Gladwell’s earlier essay on digital activism, violates the Gandhi principle. (Hmm, maybe this should be the No-Gandhi Principle. Or it doesn’t violate the Gandhi Principle, but invokes it. Which is usually a bad thing. Still sorting this part out.) The point is, both Ad Hitlerem and the Gandhi Principle opt for terminal purity over differential diagnosis. If you’re not bringing it MLK-style, you’re not really doing anything.
The irony is, Lanier’s essay is actually pretty strong at avoiding the terminal purity problem in other places — i.e., if you agree with someone’s politics, you should agree with (or ignore) their tactics, or vice versa. At its best, it brings the nuance, rather than washing it out.
Google’s Ngrams is also subject to terminal purity arguments — either it’s exposing our fundamental cultural DNA, or it’s dicking around with badly-OCRed data, and it couldn’t possibly be anything in between. To which I say — oy.
Yesterday NiemanLab published some of my musings on the coming “Speakularity” — the moment when automatic speech transcription becomes fast, free and decent.
I probably should have underscored that I don’t see this moment happening in 2011, given that these musings were solicited as part of a NiemanLab series called “Predictions for Journalism 2011.” Instead, I think several things possibly could converge next year that would bring the Speakularity a lot closer. This is pure hypothesis and conjecture, but I’m putting this out there because I think there’s a small chance that talking about these possibilities publicly might actually make them more likely.
First, let’s take a clear-eyed look at where we are, in the most optimistic scenario. Watch the first minute-and-a-half or so of this video interview with Clay Shirky. Make sure you turn closed-captioning on, and set it to transcribe the audio. Here’s my best rendering of some of Shirky’s comments alongside my best rendering of the auto-caption:
| Manual transcript: | Auto transcript: |
| --- | --- |
| Well, they offered this penalty-free checking account to college students for the obvious reason students could run up an overdraft and not suffer. And so they got thousands of customers. And then when the students were spread around during the summer, they reneged on the deal. And so HSBC assumed they could change this policy and have the students not react because the students were just hopelessly dispersed. So a guy named Wes Streeting (sp?) puts up a page on Facebook, which HSBC had not been counting on. And the Facebook site became the source of such a large and prolonged protest among thousands and thousands of people that within a few weeks, HSBC had to back down again. So that was one of the early examples of a managed organization like a bank running into the fact that its users and its customers are not just atomized, disconnected people. They can actually come together and act as a group now, because we’ve got these platforms that allow us to coordinate with one another. | will they offer the penalty-free technique at the college students pretty obvious resistance could could %uh run a program not suffer as they got thousands of customers and then when the students were spread around during the summer they were spread over the summer the reneged on the day and to hsbc assumed that they could change this policy and have the students not react because the students were just hopeless experts so again in western parts of the page on face book which hsbc had not been counting on the face book site became the source of such a large and prolonged protest among thousands and thousands of people that within a few weeks hsbc had to back down again so that was one of the early examples are female issue organization like a bank running into the fact that it’s users are not just after its customers are not just adam eyes turned disconnected people they get actually come together and act as a group mail because we’ve got these platforms to laos to coordinate |
Cringe-inducing, right? What little punctuation exists is in error (“it’s users”), there’s no capitalization, “atomized” has become “adam eyes,” “platforms that allow us” are now “platforms to laos,” and HSBC is suddenly an example of a “female issue organization,” whatever that means.
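If you want a number to go with the cringe, the usual yardstick is word error rate: the fewest word-level substitutions, insertions and deletions needed to turn the machine's version into the human one, divided by the length of the human one. Here is a rough sketch, assuming we ignore case and punctuation. The function names are mine, and the sample is one short excerpt from the two transcripts above, not the full clip.

```python
import re


def words(text):
    """Lowercase the text and split it into words, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower().replace("’", "'"))


def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = words(reference), words(hypothesis)
    if not ref:
        raise ValueError("reference transcript is empty")
    # Dynamic-programming edit distance, one row at a time, over words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)


manual = "They can actually come together and act as a group now"
auto = "they get actually come together and act as a group mail"
wer = word_error_rate(manual, auto)
print(f"word error rate: {wer:.0%}")
```

This stretch is one of the cleaner ones: two wrong words out of eleven. A score like this says nothing about which errors matter, though. "Adam eyes" for "atomized" counts the same as a dropped "uh."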
Now imagine, for a moment, that you’re a journalist. You click a button to send this video to Google Transcribe, where it appears in an interface somewhat resembling the New York Times’ DebateViewer. Highlight a passage in the text, and it will instantly loop the corresponding section of video, while you type in a more accurate transcription of the passage.
That advancement alone — quite achievable with existing technology — would speed our ability to transcribe a clip like this quite a bit. And it wouldn’t be much more of an encroachment than Google has already made into the field of automatic transcription. All of this, I suspect, could happen in 2011.
Now allow me a brief tangent. One of the predictions I considered submitting for NiemanLab’s series was that Facebook would unveil a dramatically enhanced Facebook Videos in 2011, integrating video into the core functionality of the site the way Photos have been, instead of making it an application. I suspect this would increase adoption, and we’d see more people getting tagged in videos. And Google might counter by adding social tagging capabilities to YouTube, the way they have with Picasa. This would mean that in some cases, Google would know who appeared in a video, and possibly know who was speaking.
Back to Google. This week, the Google Mobile team announced that they’ve built personalized voice recognition into Android. If you turn it on for your Android device, it’ll learn your voice, improving the accuracy of the software the way dictation programs such as Dragon do now.
Pair these ideas and fast-forward a bit. Google asks YouTube users whether they want to enable personalized voice recognition on videos they’re tagged in. If Google knows you’re speaking in a video, it uses what it knows about your voice to make your part of the transcription more accurate. (And hey, let’s throw in that they’ve enabled social tagging at the transcript level, so it can make educated guesses about who’s saying what in a video.)
A bit further on: Footage for most national news shows is regularly uploaded to YouTube, and this footage tends to feature a familiar blend of voices. If they were somewhat reliably tagged, and Google could begin learning their voices, automatic transcriptions for these shows could become decently accurate out of the box. That gets us to the democratized Daily Show scenario.
This is a bucketload of hypotheticals, and I’m highly pessimistic Google could make its various software layers work together this seamlessly anytime soon, but are you starting to see the path I’m drawing here?
And at this point, I’m talking about fairly mainstream applications. The launch of Google Transcribe alone would be a big step forward for journalists, driving down the costs of transcription for news applications a good amount.
Commenter Patrick at NiemanLab mentioned that the speech recognition industry will do everything in its power to prevent Google from releasing anything like Transcribe anytime soon. I agree, but I think speech transcription might be a smaller industry economically than GPS navigation,* and that didn’t prevent Google from solidly disrupting that universe with Google Maps Navigation.
I’m stepping way out on a limb in all of this, it should be emphasized. I know very little about the technological or market realities of speech recognition. I think I know the news world well enough to know how valuable these things would be, and I think I have a sense of what might be feasible soon. But as Tim said on Twitter, “the Speakularity is a lot like the Singularity in that it’s a kind of ever-retreating target.”
The thing I’m surprised not many people have made hay with is the dystopian part of this vision. The Singularity has its gray goo, and the Speakularity has some pretty sinister implications as well. Does the vision I paint above up the creep factor for anyone?
* To make that guess, I’m extrapolating from the size of the call center recording systems market, which is projected to hit $1.24 billion by 2015. It’s only one segment of the industry, but I suspect it’s a hefty piece (15%? 20%?) of that pie. GPS, on the other hand, is slated to be a $70 billion market by 2013.
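For anyone who wants to check the back-of-the-envelope math, here it is spelled out. The two market figures come from the footnote above. The 15% and 20% shares are my guesses, not data, and the two projections are for different years, so treat the result as a rough order-of-magnitude comparison at best.

```python
CALL_CENTER_MARKET = 1.24e9   # projected for 2015, per the footnote
GPS_MARKET = 70e9             # projected for 2013, per the footnote
SHARE_GUESSES = (0.15, 0.20)  # guessed share of the whole speech industry


def implied_industry_size(segment_size, segment_share):
    """Scale a segment up to the whole industry, given its assumed share."""
    return segment_size / segment_share


estimates = {
    share: implied_industry_size(CALL_CENTER_MARKET, share)
    for share in SHARE_GUESSES
}
ratios = {share: GPS_MARKET / size for share, size in estimates.items()}

for share in SHARE_GUESSES:
    print(
        f"share {share:.0%}: speech industry ~${estimates[share] / 1e9:.1f}B, "
        f"GPS ~{ratios[share]:.1f}x larger"
    )
```

On those guesses the whole speech industry comes out somewhere between about $6 billion and $8 billion, which would make GPS roughly 8 to 11 times bigger.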
I’m not going to recount the long insomniac thought trail that led me here, but suffice it to say I ended up thinking about mission statements early this morning. Google’s came immediately to mind: To organize the world’s information, and make it universally accessible and useful. I’m not sure what Twitter’s mission statement might be, but a benign one didn’t take too long to present itself: To enable a layer of concise observations on top of the world. (Wordsmiths, have at that one.)
I got completely stuck trying to think of a mission for Facebook that didn’t sound like vaguely malevolent marketing b.s. To make everything about you public? To connect you with everyone you know?
When I read Zadie Smith’s essay as an indictment of Facebook — its values, its defaults, and its tendencies — rather than the “generation” it defines, her criticisms suddenly seem a lot more cogent to me. I realized that I actually am quite ambivalent about Facebook. I thought it was worth exploring why.
I was thinking about the ways social software has changed my experience of the world. The first world-altering technology my mind summoned was Google Maps (especially its mobile manifestation), and at the thought of it, all the pleasure centers of my brain instantly lit up. Google Maps, of course, has its problems, errors, frustrating defaults, troubling implications — but these seem so far outweighed by the delights and advantages it’s delivered over the years that I can unequivocally state I love this software.
I recently had an exchange with my friend Wes about whether Google Maps, by making it so difficult to lose your way, also made it difficult to stumble into serendipity. I walked away thinking that what Google Maps gave me (the expectation that I can just leave my house, walk or drive, and search for anything I could want as I go) enabled much more serendipity than it forestalled. It’s eliminated most of the difficulties that might have prevented me from wandering through neighborhoods in DC, running around San Francisco, road-tripping across New England. And it demands very little of me, and imposes very little upon me. (One imposition, for example: All the buildings I’ve lived in have been photographed on Street View. I’m happy to abide this invasion of privacy, because without it, I wouldn’t have found the place I live in today.) For me, Google Maps is basically an unalloyed social good.
Google has been very prolific with these sorts of products — things that bring me overwhelming usefulness with much less tangible concern. Google Search itself is, of course, a masterpiece. News Search, Gmail, Reader, Docs, Chrome, Android, Voice — even failed experiments such as Wave — I find that these things have heightened what I expect software to do for me. They have made the Internet more useful, information more accessible, and generally, life more pleasurable.
I was trying to think of a Facebook product that ameliorated my life in some similar way, and the first thing to come to mind was Photos. Facebook Photos created for me the expectation that every snapshot, every captured moment, would be shared and tagged for later retrieval. At my fifth college reunion, I made a point of taking photos with every classmate I wanted to reconnect with on Facebook. When I go home and tag my photos, I told my buddies, it will remind you that we should catch up. And it worked like a charm! I reconnected with dozens of old friends on Facebook, and now I see their updates scrolling by regularly, each one producing a tinge of warmth and good feelings.
But the dark side of Facebook Photos almost immediately presented itself as well. For me, the service has replaced the notion of a photograph as a shared, treasured moment with the reality of a photograph as a public event. I realized all of a sudden that I can’t remember the last time I took a candid photo. Look through my photos, and even those moments you might call “candid” are actually posed. I can’t sit for a picture without expecting that the photo will be publicized. Not merely made public — my public Flickr stream never provoked this sense — publicized. And although this is merely a default, easily overridden, to do so often feels like an overreaction. To go to a friend’s photo of me and untag myself, or to make myself untaggable, feels like I’m basically negating the purpose of Facebook Photos. The product exists so these images might be publicized. And increasingly, Facebook seems to be what photos are for.
Of course that’s not true. I also suddenly realized that I’ve been quietly stowing away a secret cache of images on my phone — a shot of Bryan sleeping, our cat Otis in a grocery bag, an early-morning sunlit sky — that are quickly becoming the most treasured images I possess, the ones I return to again and again.
Perhaps Facebook Photos has made my private treasure trove more valuable.
I use Facebook Photos as an example first because it’s the part of the service that’s most significantly altered my experience of the world, but also because I think it reflects something about the software’s ethos. That dumb, relentless publicness of photos on Facebook doesn’t have to be the default. Photos, by default, could be accessible only to users tagged in a set, for example, not publicized to all my friends and their friends. I’m not even sure that’s an option. (My privacy settings allow most users to see only my photos, not photos I’m tagged in. But I’m not sure what that even means. When another friend shares a photo publicly, and I’m tagged in it, I’m fairly certain our friends see that information.)
Facebook engineered the photo-sharing system in such a way as to maximize exposure rather than, say, utility. For Facebook, possibly, exposure is utility.* I think that characterizes most of the choices that underpin Facebook’s products. With most of the other social software products I use — the Google suite, WordPress, Twitter, Flickr, Dropbox, etc. — I am constantly aware of and grateful for the many ways the software is serving me. With Facebook, I’m persistently reminded that I am always serving it — feeding an endless stream of information to the insatiable hive, creating the world’s most perfect consumer profile of myself.
I don’t trust Google for a second, but I value it immensely. I trust Facebook less, and I’m growing more ambivalent about its value.
I don’t think I want to give up Facebook. I value the connections it offers, however shallow they are. I enjoy looking at photos of my friends. I like knowing people’s birthdays.
But I am wary of it, its values and its defaults. How it’s changing my expectations and my experience of the world.
* Thought added post-publication.
I am loving James Grimmelmann’s nuanced notes on the latest Google Books Settlement hearing, both because I’m interested in the issue and because it’s interesting to understand the actual legal process better. Okay, also because James throws in stuff like this:
Judge Chin, however, threw a curveball, asking how Rubin would respond to Sony’s arguments about competition. The substance of the question was squarely in line with the issues Rubin was arguing, but I think the unexpected form it took, like the Stay-Puft Marshmallow Man, caught him off guard.
This is my kinda court reporting.