Dull Alexandrias

This is supposedly a list of the ten biggest databases in the world. But I am suspicious: I really feel like the U.S. federal government ought to rate more of those top spots. What about Social Security? Or some sort of crazy Medicare database?

Also, could YouTube’s database really be larger than, say, Visa’s?

Anyway, I’m still linking to it just because I love the idea of Really Huge Databases. Any other contenders you can think of?

February 16, 2007 / Uncategorized


“I love the idea of Really Huge Databases.”

Now THAT, my dear young friend, is truly odd 🙂


P.S. Now that I think about it, I kind of love it, too.

They are measuring by the size of the database, so I wouldn’t be surprised at YouTube’s inclusion. Video takes up a lot more space than lists of names and numbers.

The same could be said of the phone companies, though, and some of their DBs merit inclusion on this list. I guess the average person generates more phone calls than credit card transactions, though… fair enough.

The one that I thought was missing was Wal-Mart, which I once heard had a DB as large as Google’s (though that’s probably not true anymore).

Not that I’m biased, but …

If “In terms of internet databases, Google is king”, then why is it #4? Huh? Huh?

Dan says…

I think your suspicion is well-founded, Robin.

PoN suggests that the author based this list purely on the size of each database (in bytes), but there is no way to confirm this, and there is not even a hint at methodology. So I’d say a more apt title would have been: 10 really big databases, from a wide variety of fields.

The more important question, though, would be: why should a list like this matter? What do we see differently for having read it? If nothing else, it’s a good guide to where to go to find expertise on maintaining large databases, and perhaps a guide to places that we would expect to be driving innovation in database management.

I will make one other comment: the #2 spot on the list goes to the National Energy Research Scientific Computing Center. While I’m sure that the NERSC does in fact do lots of interesting basic science computing work, the fact that almost half of its users are associated with DOE labs, coupled with what I learned working for a little while in an inertial confinement fusion lab, suggests that an awful lot of the science that NERSC is doing is pitched as a means of maintaining the United States’ thermonuclear arsenal (the stockpile stewardship program). Glance at the DOE budget sometime and note how much of it is spent on nuclear weapon maintenance or clean-up from past nuclear production—the modern DOE is still dominated by what used to be called the Atomic Energy Commission. I offer this only for perspective: this massive database may not just exist b/c these federal scientists want to know about the big bang.
