Saturday, June 16, 2007

The Aleph and the Knowledge Worker

or, The nature of knowledge in the wired world.

[warning -- formless braindump]

Borges' story The Aleph tells of a point at which a person can view the entire state of the world, everything concentrated and simultaneous. Typically for Borges, it's a metaphysical horror story with uncanny resonances in both present reality and the future. The Aleph was published in 1949, four years after Vannevar Bush's seminal article As We May Think, which described the Memex, the postwar technocrat's version of The Aleph as a realizable device. Did Borges read Bush, or just intuit a zeitgeist that was blowing through both of them?

Bush's article is an amazing mix of insightful prophecy and some laughably wrong technical predictions (e.g., that storage would involve photographic processes). But mostly he got it right. What is really startling, however, is that although we have information systems orders of magnitude more capable than he dreamed of, the problem he described still remains:

There is a growing mountain of research. But there is increased evidence that we are being bogged down today as specialization extends. The investigator is staggered by the findings and conclusions of thousands of other workers -- conclusions which he cannot find time to grasp, much less to remember, as they appear. Yet specialization becomes increasingly necessary for progress, and the effort to bridge between disciplines is correspondingly superficial.


He anticipated the primacy of search:

The prime action of use is selection, and here we are halting indeed. There may be millions of fine thoughts, and the account of the experience on which they are based, all encased within stone walls of acceptable architectural form; but if the scholar can get at only one a week by diligent search, his syntheses are not likely to keep up with the current scene.

Selection, in this broad sense, is a stone adze in the hands of a cabinetmaker. Yet, in a narrow sense and in other areas, something has already been done mechanically on selection. The personnel officer of a factory drops a stack of a few thousand employee cards into a selecting machine, sets a code in accordance with an established convention, and produces in a short time a list of all employees who live in Trenton and know Spanish.

And user interfaces:

One can consider rapid selection of this form, and distant projection for other purposes. To be able to key one sheet of a million before an operator in a second or two, with the possibility of then adding notes thereto, is suggestive in many ways. It might even be of use in libraries... One might, for example, speak to a microphone, in the manner described in connection with the speech controlled typewriter, and thus make his selections. It would certainly beat the usual file clerk.

And blogging and search trails and annotation (which the web has still not quite got right):


It affords an immediate step, however, to associative indexing, the basic idea of which is a provision whereby any item may be caused at will to select immediately and automatically another. This is the essential feature of the memex. The process of tying two items together is the important thing.

When the user is building a trail, he names it, inserts the name in his code book, and taps it out on his keyboard. Before him are the two items to be joined, projected onto adjacent viewing positions... The user taps a single key, and the items are permanently joined...

Thereafter, at any time, when one of these items is in view, the other can be instantly recalled merely by tapping a button below the corresponding code space. Moreover, when numerous items have been thus joined together to form a trail, they can be reviewed in turn, rapidly or slowly, by deflecting a lever like that used for turning the pages of a book. It is exactly as though the physical items had been gathered together from widely separated sources and bound together to form a new book. It is more than this, for any item can be joined into numerous trails.
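
For what it's worth, the trail mechanism Bush describes maps onto a very small data structure. Here is a minimal Python sketch of it. The class and method names (Memex, join, review, trails_for) and the item labels are my own inventions for illustration, not anything from Bush's article, and the sketch ignores the microfilm, the lever, and the code book entirely.

```python
from collections import defaultdict


class Memex:
    """Toy model of Bush's associative trails (all names are hypothetical)."""

    def __init__(self):
        # trail name -> items in the order they were joined
        self.trails = {}
        # item -> names of every trail that item belongs to
        self.membership = defaultdict(list)

    def join(self, trail, first, second):
        """Tie two items together on a named trail, creating it if needed."""
        items = self.trails.setdefault(trail, [])
        for item in (first, second):
            if item not in items:
                items.append(item)
                self.membership[item].append(trail)

    def review(self, trail):
        """Return the items of a trail in order, like paging through a book."""
        return list(self.trails.get(trail, []))

    def trails_for(self, item):
        """Return every trail an item has been joined into."""
        return list(self.membership.get(item, []))


memex = Memex()
memex.join("first trail", "article A", "textbook B")
memex.join("first trail", "textbook B", "table C")
memex.join("second trail", "textbook B", "my notes")

print(memex.review("first trail"))
print(memex.trails_for("textbook B"))
```

The second index (item to trails) is what captures Bush's last point: an item is not filed in one place but can sit on any number of trails at once.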


So. Now we are at a point where we have huge amounts of information at our fingertips, and reasonably good ways to search for it, and crude but effective ways to link it together. The combination of internet standards, high-speed access, Google and open access scientific publications has made a new world. We should be in knowledge paradise!

But that's not how it feels. The basic human problem of how to deal with information hasn't gone away; it's just been raised to the nth power. Finding documents is easy; deciding which are worth reading, and in how much detail, is as difficult as ever. Where should effort be focused, and how can one narrow down a global curiosity into a manageable subset?

Look at the typical modern knowledge worker, trying desperately to keep track of all the things they want to know, or ought to know. Maybe you aren't in this boat, but I am -- comes of too much intellectual curiosity. My RSS reader has 300 feeds! Categorized variously -- I read the politics blogs mostly for entertainment (in that they don't generally spur me to action), the philosophic ones for ideas, the tech ones mostly out of obligation.

Even within tech, it's a lost cause. There are about five different programming languages I'm involved with right now -- do I want to keep up with current developments in them? And what about all the components and toolkits; what's going on with Prototype or Scriptaculous? Or Lisp, Ruby, Python? Then there is the entire universe of Java language, components, and tools like Eclipse, which is a whole sub-universe unto itself.

There's no way I can be an expert in all of this stuff. Can I be an expert in finding just the right piece of knowledge I need? Well, that's where Google comes in handy. I've had pretty good luck, the last six months or so, googling for answers to obscure or not-so-obscure tech questions. A couple of issues though:

It gives a big advantage to tools with a large user community. For instance, I've had occasion to use both Oracle and the very nice OpenLink Virtuoso, an open-source database/semantic web platform/middleware/kitchen sink. The problem is, hardly anyone is using Virtuoso, so my stupid questions do not have a stupid answer available with a quick Google. Instead, I need to mail the developers and maybe they get back to me in a day or so, or maybe not.

So where was I? Oh yeah, trying to think about Google and Borges and Memexes and focus and what it means to know everything...got distracted...

The point is that it's impossible to know everything, and not even that useful. But what should an informational omnivore know? How to use Google effectively, sure... but there seems to be a deeper idea lurking somewhere in there.

The nature of knowing is going to be different in the future. I was trying to get at this when I coined the term "googlectual" -- as search gets incorporated into our thinking, the knowledge in the digital sphere becomes a part of what we know -- sort of. We lack good metaphors and ways for thinking about the relationship between knowledge-in-the-head and knowledge-in-the-world and the increasingly tight coupling between them.

Alright, I'll stop now.
