No one can know the future, right? But wait. You finish my sentence, tell me what I am going to do. Despite the complex, ambiguous nature of the causes of human action, you sort out things about me. But how? With experience you learn the essential issues that differentiate one state from another. For example, observing the risk-averse nature of an otherwise silent poker player tells us this big bet is not a bluff. Most of the information that reaches us is a nearly opaque amalgam of many sources, like a multiple-exposure negative, or a distorting selection of the original information, as with what we can see on a fine-print page after it's crumpled.
In text, language, archaeological artifacts, and communication over the phone or a portal, only part of something culturally far more complex is available. Much is chopped off when scanning a text base with just a few words or visualizing n dimensions in just two. Even most forms of modeling such as those used in data mining, decision analysis, and online analytic processing (OLAP) software require formalizing with broad statistical generalizations that leave much structure behind.
FIGURE 1 This acoustic energy map by Howard Nusbaum at the University of Chicago shows how in normal speech there are no real spaces between words. Yet, we clearly separate and hear distinct words.
Sometimes the information we receive is nothing more than a thin trace that seems random but actually contains rich meaning. Imagine a baker wearing a miner's headlamp aimed at the ceiling. The light's movement on the ceiling would contain information about the baker's actions, but all that information would be squeezed and jumbled into the narrow channel of the light beam.
Many seemingly random events, such as price movements, are like that baker's light trace on the ceiling. Those who can see only the light's path and know little of baking would say the movement is random. But those with the right experience will infer the source's current activities, whether the indicator is a baker's headlamp or a company's stock price. Check out A Non-Random Walk Down Wall Street (Princeton University Press, 1999) by Andrew W. Lo of MIT and A. Craig MacKinlay of the University of Pennsylvania for a refutation of the legacy notion that securities move randomly.
This human ability to interpret sketchy explicit evidence is amazing; people can uncover meaning from limited channels in ways that computer methods have not yet matched. We understand rapid speech even though there are no boundaries between words (see Figure 1), figure out situational meaning, and determine complex relations in a visual field. Computers don't do any of these things well; for instance, they often still cannot determine where one object ends and another starts in a visual field.
As Steven Pinker at MIT notes with respect to human speech:
"Syntax and morphology are codes that map multidimensional semantic data structures onto strings of symbols that can be transmitted through a serial interface. Phonetics compresses the units into a signal that transmits them at a rate that exceeds the resolving power of the ear." (Conversations in the Cognitive Neurosciences, MIT Press, 1996, ed. Michael S. Gazzaniga)
Yet somehow we are still able to hold conversations and be understood. Speech production is not a one-to-one compression of ideas. At many steps in the process a lot of detail is lost, so there is no inverse decompressing function. We are doing the equivalent of seeing oranges in orange juice. But how?
Recovering the richness of a trace's origin, disambiguating hidden causes, is the computational problem I begin to explore here. To do so, I will give some examples of variables that display some of the key properties you would want in order to solve these sorts of problems: variables that differentiate, add information, disambiguate, demonstrate credibility, and even seem to model gained experience.
Differentiating. Example: Stock and bond co-movement and GNP. The stock and bond markets seem to roughly move together. But if you differentiate between periods of high GNP growth and all other periods, they in fact move in opposite directions in the former and even more clearly together in the latter. Similarly, in my article "Predicting Movement" (Decision Support, Aug. 3, 1999), I showed how one variable, the past trend of a currency, could be used to segregate upcoming rising periods from falling ones.
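To make the idea of a differentiating variable concrete, here is a minimal sketch in Python. The period returns below are made-up numbers chosen only for illustration, not the market data behind the observation above, and the names `periods`, `pearson`, `pooled`, and `by_regime` are hypothetical. It shows how a weak pooled correlation can hide two strong, opposite relationships once the periods are split by a growth regime label.

```python
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

# ASSUMPTION: invented (regime, stock_return, bond_return) rows for illustration.
periods = [
    ("high_growth", 3.0, -1.0),
    ("high_growth", 1.0, 0.5),
    ("high_growth", 2.0, -0.5),
    ("high_growth", 4.0, -1.5),
    ("other", -2.0, -2.5),
    ("other", 0.0, -0.5),
    ("other", 1.0, 1.5),
    ("other", -1.0, -1.0),
]

# Correlation over all periods, ignoring the regime label.
pooled = pearson([s for _, s, _ in periods], [b for _, _, b in periods])

# Correlation within each regime, using the label as the differentiating variable.
by_regime = {}
for regime in ("high_growth", "other"):
    rows = [(s, b) for r, s, b in periods if r == regime]
    by_regime[regime] = pearson([s for s, _ in rows], [b for _, b in rows])

print(f"pooled: {pooled:+.2f}")
for regime, r in by_regime.items():
    print(f"{regime}: {r:+.2f}")
```

With these invented numbers the pooled correlation is weakly positive, while the two regimes separate into one negative and one strongly positive relationship. Whether real markets behave this way depends entirely on the data and on how the regimes are defined.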
Adding information from outside the channel of focus. Example: Gestures. David McNeill at the University of Chicago has shown how gestures add disambiguating information to speech. If I say, "Take this book and put it over there but leave that one untouched," you may not know what to do unless I point as I speak. McNeill argues that language starts with mental images and emerges through multiple channels, including speech and gestures, each encoding different information. A man telling a story says, "and he put the knife away," while his hand tells that it went in his pocket.
Disambiguating. Example: Pulling apart complex topologies in words and texts. Often relations come across as inordinately complex because of a unidimensional measure of a multidimensional interaction. In my last article, "The Solution Engine" (Decision Support, Dec. 21, 1999), I indicated that sometimes the meanings of words, texts, and documents seem to have complex topologies. This often occurs because we are using a unidimensional measure (such as the Euclidean distance in a text or word vector space) for a multidimensional interaction (such as the complex, overlapping content in differing texts). However, what looks complex may be made of simple components when separated by the right variables. In "The bear did bear good tidings," intratextual positioning lets us know the first "bear" is a noun and the second a verb.
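The "bear" example can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method from "The Solution Engine": the left-neighbor pairing stands in for intratextual positioning, the second sentence is an invented reordering of the first, and the function names are hypothetical. A plain word-count vector puts both uses of "bear" on one axis, so a single Euclidean distance cannot tell the two texts apart; adding position as a variable separates them.

```python
from collections import Counter
from math import sqrt

def bag_of_words(text):
    """Word counts only; all positional information is discarded."""
    return Counter(text.lower().split())

def with_left_context(text):
    """Counts of (previous word, word) pairs, a crude stand-in for position."""
    tokens = text.lower().split()
    return Counter(zip(["<s>"] + tokens[:-1], tokens))

def euclidean(a, b):
    """Euclidean distance between two sparse count vectors."""
    keys = set(a) | set(b)
    return sqrt(sum((a[k] - b[k]) ** 2 for k in keys))

s1 = "the bear did bear good tidings"
s2 = "good tidings did bear the bear"  # ASSUMPTION: invented reordering

# Same words, same counts: the one-number measure sees no difference.
flat_distance = euclidean(bag_of_words(s1), bag_of_words(s2))

# With position as an extra variable, the two texts come apart.
context_distance = euclidean(with_left_context(s1), with_left_context(s2))

# The two occurrences of "bear" in s1 now sit in different contexts.
bear_contexts = sorted(prev for prev, word in with_left_context(s1) if word == "bear")

print(flat_distance, context_distance, bear_contexts)
```

Here the word following "the" is the noun and the word following "did" is the verb, so even this crude positional variable is enough to pull the two senses apart.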
Credibility through multiple lines of evidence. Example: Stock groups. For a sleuth, a scientist, or even the average Joe, perception of an idea's veracity increases with multiple confirming lines of evidence. In forecasting some stock values by groupings of others, I knew the groupings represented some reality when all the members were correlated with each other; this reinforced the belief that they all belonged together and were affected by some common economic issue.
Experiencing realities. Example: Decision trees. When analytic results clearly contain nuggets of truth gained through the statistical equivalent of experience, they often can be used to analyze further problems. The mark that you've gained a key concept is its applicability to multiple differing domains. Extending work I outlined in "Win-Win Marketing" (Decision Support, Oct. 5, 1999), I used decision trees iteratively. As it turned out, the customers segregated as the type who might want the bill-pay service were a far better class and component in the forecast of who would be interested in a given savings product than the much smaller class of actual bill-pay users was. This closed-loop process seemed to help throughout the analysis.
Plato wrote the parable of the cave as an image for finite human knowing. We are like a man who has been in a cave all his life, and all he sees are the shadows of events outside arriving through a mere opening around a bend. As you can now see, we have a chance to emerge from our cave by finding new ways to shed light with ever more sophisticated methods of modeling the subtle complexities of the world around us.
Barry Grushkin (bgrushkin@dsslab.com) is a researcher at the DSS Lab (www.dsslab.com) with its founder Erik Thomsen in Cambridge, Mass.