Science
Why Google Ngrams F—ing Sucks
It’s harder than you might think to use Google Ngrams to actually chart trends in cultural history — or do “culturomics,” as the Science article authors would have it — because of well-known problems with the data set.
Here, Matthew Battles tries (on more or less a lark) to see some history play out, Bethany Nowviskie spots a trend (maybe true, maybe false), and Sarah Werner flags the problem.
Aw, man — that fhit Seriously Pucks.
You know what would actually be pretty cool, though? If it were easier to go one level deeper and use Ngrams to do Google Instant Regression. You could graph trends against well-known noise (other s-words misread as f) AND other trends — or instantly find similar graphs.
Let’s say the curve of the graph for the f–word in the 1860s is similar to that for other words and phrases — like “ass”* or “confederacy”* — you could correlate language with other language, individual words with stock phrases, and even (using language as an index/proxy) extralinguistic cultural trends or historical events.
Single-variable analysis just doesn’t tell you very much, even on a data set as problematic as print/language. You need systematic data, and better comparison and control capacity between variables, before you can start to do real science.
(* Ignore for the purposes of this example ascribing contemporary historical meanings to these two ambiguous terms.)
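Here's a minimal sketch of the "instantly find similar graphs" idea, assuming you've already pulled yearly frequencies into plain lists. Everything in it is hypothetical: the function names are mine, the numbers are made up (they are not real Ngrams data), and plain Pearson correlation is standing in for whatever a real Google Instant Regression would do.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(var_x * var_y)


def most_similar(target, candidates):
    """Rank candidate series by correlation with the target, highest first."""
    scores = [(name, pearson(target, series)) for name, series in candidates.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Made-up yearly frequencies for five consecutive years; NOT real Ngrams numbers.
target = [1.0, 2.0, 4.0, 3.0, 2.0]
candidates = {
    "rises_and_falls_together": [2.0, 4.0, 8.0, 6.0, 4.0],
    "moves_the_opposite_way": [4.0, 3.0, 1.0, 2.0, 3.0],
    "long_s_noise": [5.0, 4.0, 3.0, 2.0, 1.0],
}
ranking = most_similar(target, candidates)
```

The series that tracks the target's shape lands at the top of `ranking` and the mirror-image one at the bottom. A high score only says two curves move together; it says nothing about why, which is exactly where the comparison and control variables come in.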
Artificial ecologies
You say “artificial ecologies,” and it sounds like you’re talking about zoos or aquariums or biodomes or terraforming or something. But actually, every legal border on a map creates an artificial ecology. Nicola at Edible Geography (following a post from FP Passport) explains:
For example, the antlion surplus in Israel can be traced back to the fact that the Dorcas gazelle is a protected species there, while across the border in Jordan, it can legally be hunted. Jordanian antlions are thus disadvantaged, with fewer gazelles available to serve “as ‘environmental engineers’ of a sort” and to “break the earth’s dry surface,” enabling antlions to dig their funnels.
Meanwhile, the more industrial form of agriculture practised on the Israeli side has encouraged the growth of a red fox population, which makes local gerbils nervous; across the border, Jordan’s nomadic shepherding and traditional farming techniques mean that the red fox is far less common, “so that Jordanian gerbils can allow themselves to be more carefree.”
I’m fascinated by the fact that differing land-use practices, environmental legislation, and agricultural technology on either side of the political border have shaped two distinct and separate ecosystems out of what would otherwise be a shared desert environment.
(Note: sorry for the lack of posts this week. I’m still in hospital — with hopes of a Monday release! — and among its many other sins, the internet here blocks Google. Can’t even tell you the ridiculous workarounds I’ve had to do just to get the links for this post together. Suffice it to say, Yahoo sucks. As does having nearly all of your internet life hosted by a single company whose pages can get firewalled for no good reason.)
The kids are alright
I love this man — more than I loved Carl Sagan or Richard Feynman or Mr Wizard or the detectives on MathNet.
[Video: The Colbert Report, Neil deGrasse Tyson]
I’m glad my children get to have him.
Albert and Kurt

Albert and Kurt, via Nerdboyfriend.
This is my preferred vision of the all-knowing creator figure. He must a) have hair like that, and b) wear a nice unassuming blue sweatshirt.
SERIOUS QUESTION: Would this have been a fun conversation to be in? Like, reflected glow of fame aside, were these guys actually enjoyable to talk to? Any anecdotes or insights?
Hyperlexia
I had never heard of this disorder before:
In hyperlexia, a child spontaneously and precociously masters single-word reading. It can be viewed as a superability, that is, word recognition ability far above expected levels… Hyperlexic children are often fascinated by letters and numbers. They are extremely good at decoding language and thus often become very early readers. Some hyperlexic children learn to spell long words (such as elephant) before they are two and learn to read whole sentences before they turn three. An fMRI study of a single child showed that hyperlexia may be the neurological opposite of dyslexia.[2]
Often, hyperlexic children will have a precocious ability to read but will learn to speak only by rote and heavy repetition, and may also have difficulty learning the rules of language from examples or from trial and error, which may result in social problems… Their language may develop using echolalia, often repeating words and sentences. Often, the child has a large vocabulary and can identify many objects and pictures, but cannot put their language skills to good use. Spontaneous language is lacking and their pragmatic speech is delayed. Hyperlexic children often struggle with Who? What? Where? Why? and How? questions… Social skills often lag tremendously. Hyperlexic children often have far less interest in playing with other children than do their peers.
The thing is, this absolutely and precisely describes me in childhood, especially before the age of 5 or 6. (This is also the typical age when hyperlexic children begin to learn how to interact with others.) It also describes my son — which is how my wife found the description and forwarded it to me.
You walk around your entire life with these stories, these tics, and the entire time, your quirks are really symptoms. It’s a little strange.
Swimming Out Of The Death Spiral
And now for a note on the dark side of printed books: Michael Jensen, Director of Strategic Web Communications for National Academies and National Academies Press, collects and analyzes data about global warming and ecological collapse. At the AAUP meeting in Philadelphia, he presented “Scholarly Publishing in the New Era of Scarcity,” an argument that the combination of financial and environmental necessity compels university presses to move away from printing, shipping, and storing books and towards a digital-driven, open-access model, with print-on-demand and institutional support rounding out the new revenue model.
(I’m posting Part 2 of Jensen’s speech — the part that’s mostly about publishing — here. Watch Part 1 — which is mostly about the environment — if you want to be justly terrified about what’s going to happen to human beings and everything else pretty soon.)
This is one reason I’m kind of happy that we didn’t print a thousand or more copies of New Liberal Arts. We can make print rare, we can get copies straight to readers, we can make print more responsible, but mostly we have to make print count. And — of course — share the information with as many people as possible.
Evolution 2.0 (and 3.0 beta)
This is kind of a cool idea. Let’s say that evolution writ large is only accidentally about the preservation, transmission, and development of living species, but essentially about the preservation, transmission, and development of information. On this view, organisms are just a means to an end, particularly well-adapted couriers for all of this chemical data.
If that’s the case, then maybe there isn’t anything particularly special about the specific form of that data (i.e. DNA) or the way it’s been transmitted in humans (sexual reproduction). That’s just one way of doing things — in nonconscious, nonverbal, or nonhistorical species, genetic transmission, instinct, inherited traditions are the only means you’ve got. But once modern humans arrive on the scene, with all their increasingly sophisticated means of representing information, then Evolution 1.0, internal transmission of information, isn’t the only game in town — you’ve also got Evolution 2.0, characterized by the external transmission of information.
Once you reframe evolution in this way, then you can say that our species’ rate of evolution “over the last ten thousand years, and particularly… over the last three hundred” is actually off the charts.
So the guy who’s arguing this is a physicist named Stephen Hawking. (Maybe you’ve heard of him — he’s awfully smart, and was part of Al Gore’s Vice Presidential Action Rangers.) He also says that our tinkering with evolution ain’t over:
[W]e are now entering a new phase, of what Hawking calls “self designed evolution,” in which we will be able to change and improve our DNA. “At first,” he continues “these changes will be confined to the repair of genetic defects, like cystic fibrosis, and muscular dystrophy. These are controlled by single genes, and so are fairly easy to identify, and correct. Other qualities, such as intelligence, are probably controlled by a large number of genes. It will be much more difficult to find them, and work out the relations between them. Nevertheless, I am sure that during the next century, people will discover how to modify both intelligence, and instincts like aggression.”
If the human race manages to redesign itself, to reduce or eliminate the risk of self-destruction, we will probably reach out to the stars and colonize other planets. But this will be done, Hawking believes, with intelligent machines based on mechanical and electronic components, rather than macromolecules, which could eventually replace DNA based life, just as DNA may have replaced an earlier form of life.
I can’t decide if this is totally anthropocentric, or exactly the opposite. But it’s kind of exciting, isn’t it? I’m evolving the species right now, just by typing this! And so are you, by reading it! And so are Google’s nanobots, by recording all of it in their fifteenth-gen flash brains!
Geeking Out, c. 1990

I love this; Hewlett-Packard is selling an exact copy of its HP-12C financial calculator for the iPhone.
The iPhone version of the HP-12C is a near carbon copy of the actual machine. It not only looks the same, but it actually runs the same code as do the physical calculators. The iPhone version is actually a bit better than just a clone of the original, though, because HP includes a simplified portrait-mode calculator (the 12C is a landscape-mode device). When used in portrait mode, you can use the number keys, along with all the usual math operators and a couple of other functions such as square roots and memory—perfect for those times when you just need a basic calculator.
The real power of the HP-12C is found when you rotate your iPhone to landscape mode; what appears on the screen then is a photographic reproduction of the actual HP-12C calculator, complete with the gold-brown-orange-blue color scheme that made the original so…endearing? Because the app uses the actual calculator’s code, absolutely everything works just like it does on the real calculator.
I used a calculator just like this to win a middle school mathematics competition — in those days, it was called a “Calculator Competition,” because you could (gasp!) use a calculator. There was a school-wide thing, then a regional, and then a state final; it was a whole thing. The state final was the first time I’d ever seen a graphing calculator; that shiz blew my mind.
Language Is A Technology That Restructures Language
Lera Boroditsky has a super-interesting essay at Edge on her work empirically testing the proposition that language structures thought. (Blërg — resisting urge to… blockquote.… sigh.)
So Boroditsky’s got some clever tests, including asking speakers/writers of a different language to arrange pictures chronologically (Roman languages tend to arrange chronology from left to right, Hebrew from right to left, and fascinatingly, the Kuuk Thaayorre in Australia do it from east to west), and testing incidences of adjectives speakers of languages with gendered nouns assign to those nouns — Germans think keys (male) are hard and jagged and bridges are slender and beautiful, where Spanish-speakers (whose gender assignations switch the nouns) correspondingly flip associations.
But… okay, look. I believe in this thesis. But the tests to my mind are not conclusive evidence. Here’s why.
You can’t get into a person’s head.
Is it that simple? It is.
Because (stay with me) all of these tests don’t show that speakers of different languages think differently, but that they represent thought differently. The way we write changes the way we talk, and the way we represent thought in space. The way we talk also changes the way we write. And the way we talk changes the way we talk. You don’t have any evidence (at least, any evidence that doesn’t assume the premise) that Germans actually THINK bridges are more graceful or beautiful than Spaniards do, just that they’re more likely to use adjectives with feminine associations with feminine nouns. What this suggests immediately is that language is a complex and interconnected system where terms and kinds group together, and small linguistic changes actually trigger a series of different linguistic associations and values. It DOESN’T immediately prove that language structures thought, understood as something independent from its representation.
Because if language is the vocal and visual representation of concepts, then ALL of Boroditsky’s tests are instances of language. Language structures language. And once you assume unproblematically that language directly represents thought, then you naturally discover that thought and language are inseparable. Which is what was to be shown. But this is logically a tautology, even if the empirical specifics of how that tautology manifests itself are fascinating.
Let me reframe this, then. What I think these experiments show is that in moments where we may think we are simply registering our pure and unmediated experience of the world, we’re really on auto-pilot — language is in fact doing our “thinking” for us. But this kind of not-quite-thinking doesn’t automatically deserve to be called “thought” at all.

Volcano, Meet Cloud; Cloud, Volcano
Um, wow: