CMP -- United Business Media

Intelligent Enterprise

Better Insight for Business Decisions




Within Plain Sight

Information visualization is a boon for one-to-one e-marketing, but be sure you ask the right business questions

By Don Nachtwey



Of the five human senses, sight is the best one for collecting data. Indeed, two-thirds of all data we receive is visual. The cells of the eye are so sensitive that they can respond to a single photon of light. Yet the eye has no ability to interpret or store images; interpretation is a cognitive ability and storage is a function of memory. Thus, visualization is not what we “see,” but rather the ability to associate what we see with our experience. Consider this example: When a human encounters a skunk, he doesn’t have to see the animal in order to know to leave the area. If someone detects a skunk’s scent but sees a rabbit, each time a rabbit appears thereafter, that person will probably leave the area as well. Visualization, then, is a powerful tool to deceive as well as conceive, with the risk of deception decreasing with viewer experience.

Sometimes experience creates imagery so powerful that it controls behavior. Watch professional golfers’ routines before actually hitting the ball: They step behind the ball, survey the landscape, and pick a point where they want the ball to land. They can maintain a mental image of the point without considering the back swing, point of contact, or follow through. The repetition of practice has the ability to form the mental image so that the elements of the task conform to the image and achieve the desired results. Similarly, an architect will begin a project by visualizing the results; the drawings, materials, and labor then conform to that vision while the builders create the structure.

In the age of e-business, data is scattered across the organization, so distinguishing valuable nuggets from extraneous noise is an important process. Data visualization can be a powerful vehicle for achieving that goal — particularly for e-commerce applications — but for the reasons I’ve described, it can also be misleading. The key is to recognize that data visualization is not only a function of storing, accessing, and displaying data properly, but also of knowing which business questions to ask.

First Things First

In the enterprise, understanding your data begins with knowing what is available — no easy task. Data warehouse design determines how data is stored; the data’s structure and the connections among fields and records will determine what knowledge the database can and cannot yield. Furthermore, you must associate key elements of the data properly. Consider the example of an instrument panel in an airplane. The compass, airspeed indicator, altitude indicator, and other operating instruments are prominently located directly in front of the pilot. The battery power, fuel gauge, and other infrastructure instruments are smaller in size and sit at the peripheries of the instrument panel. This design is a logical one because accessing information about battery power won’t help the pilot avoid trees or land the aircraft in turbulence. This example is simplistic, but it makes the point: The database’s design must be logical. Dimensional data modeling, of course, is the generally accepted solution here.

Next, you need access to that data through a decision-support system (DSS). Online analytic processing (OLAP), data mining, and data visualization form the foundation of a DSS. These data access tools can complement each other or work independently: OLAP technology will verify complex, human-generated hypotheses involving multiple dimensions (such as “What were sales relative to plan?”); in data mining, analysts apply artificial intelligence, decision trees, and other statistical techniques to discover patterns and relationships among data records; and data visualization uses charts, graphs, and diagrams to present the information in a way that leverages the human brain’s innate pattern recognition facilities.

Let’s say Worldcruise.com wants to create an online promotion for a Caribbean vacation package. Worldcruise maintains a data warehouse of its existing customers that it mines for prospects, but the company also wants to attract new customers. It has many Internet-based methods from which to choose. For example, it could develop an affiliate marketing strategy with travel-related Web sites. Targeted banner ads on high-traffic Web sites could also effectively match customers with promotions; DoubleClick.com collects clickstream statistics on Web banner promotions to determine the best location for ads.

Worldcruise now has two new channels for delivering customers to its promotion. Its prospects enter the store and leave a trail of activity that may or may not terminate with a sale. In any case, it can capture detailed information and use it for future promotions. If the customer accepts the promotion, he or she must provide specific information to complete the transaction. If not, the merchant may store a cookie appended with a promotion code or banner ID and put it in the data warehouse.

Here’s the challenge: The quantity and randomness of the data from customers who browse but don’t buy make understanding that data — and designing the data warehouse to support differing types of analyses — difficult. However, you can use data visualization techniques to compare the shopping behavior of the buyer to that of the browser. These techniques will kick off the pattern recognition process, helping designers formulate a logical data model that supports OLAP queries and data mining. These methods will let you explore the data for the critical decisions that led one customer to purchase and another customer to pass. When these decision processes are understood, the merchant can use data visualization to perform what-if analysis to see how the results might change when the behavior changes.

Risks and Rewards

Data visualization is an art form in that techniques based on skill and creativity will get the most out of the data. But there are many risks that can deceive the uninitiated. First, it is possible to visualize bad or incomplete data. Remember the person who smelled a skunk but saw a rabbit? That individual is working with bad data and is sure to have poor results when he crosses the path of the real “data source.” For this reason, you must address the risk of collecting unclean data. A periodic audit of the data sources and collection process will minimize this risk. In collecting clickstream data from the Web server, keep in mind that the “clickable” elements on your page will change and that you must give your click logs unique codes and update them. If a data source includes profile information volunteered from your Web visitors, you should establish assumptions about that information’s accuracy. Charting the data is a good first step to test for integrity; if the results are skewed or reveal unexplainable anomalies, it may be time to do an audit.

There is also a risk of displaying good data badly. For example, one common problem occurs when the variables are not properly segmented or incremented. Perhaps Worldcruise wants to understand the age demographics of its travelers. It creates a chart showing purchases by age; the age increments are 16 to 25, 26 to 35, 36 to 50, and 51 to 75. Without looking at it, we would expect the graph to show the highest concentrations in the 36 to 50 and 51 to 75 brackets because the former segment has the highest income and the latter has the most people. But if the question the company needs to answer is “What age group redeems the most offers?”, we cannot draw an intelligent conclusion from the chart because the segments are not comparable: two have a nine-year range, one a 14-year range, and one a 24-year range.
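One way to see the effect of unequal brackets is to divide each bracket's purchases by its width before charting. The sketch below is only an illustration: the purchase counts are invented, and the function name is hypothetical. It uses the bracket boundaries from the Worldcruise example.

```python
# Hypothetical purchase counts per age bracket (invented for illustration).
purchases_by_bracket = {
    (16, 25): 180,
    (26, 35): 270,
    (36, 50): 420,
    (51, 75): 480,
}


def purchases_per_year_of_age(counts):
    """Divide each bracket's count by its width so brackets can be compared."""
    density = {}
    for (low, high), count in counts.items():
        width = high - low  # 9, 9, 14, and 24 years for the brackets above
        density[(low, high)] = count / width
    return density


density = purchases_per_year_of_age(purchases_by_bracket)
for (low, high), value in density.items():
    print(f"{low}-{high}: {value:.1f} purchases per year of age")
```

With these made-up numbers, the 51 to 75 bracket has the most raw purchases, yet per year of age it ties with the youngest bracket for the lowest concentration. Answering the redemption question properly would also require knowing how many offers reached each bracket, which this sketch does not model.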

Another risk is choosing the wrong data set to display. This risk derives from asking the wrong or incomplete question. Let’s say Worldcruise wants to know how to allocate new resources based on the results of its marketing strategy. The question may lead analysts to look at a graph that compares the click-throughs from each channel strategy. Not only would this graph lead to a poor decision, it is the wrong data set for the decision. Rather, the question that the analysts should ask is: “How should we allocate resources based on the yield from each channel strategy?” Worldcruise may have to pay for each click-through. Therefore, a graph that compares redemption percentages (redemptions: click-throughs) would measure performance more effectively.
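The difference between the two questions can be sketched in a few lines. The channel names and figures below are invented, and the `ChannelResult` class is a hypothetical helper, not part of any product mentioned here. The only thing taken from the text is the ratio itself: redemptions divided by click-throughs.

```python
from dataclasses import dataclass


@dataclass
class ChannelResult:
    name: str
    click_throughs: int
    redemptions: int

    @property
    def redemption_rate(self) -> float:
        """Redemptions per click-through; 0.0 when a channel had no clicks."""
        if self.click_throughs == 0:
            return 0.0
        return self.redemptions / self.click_throughs


# Invented figures for two hypothetical channels.
channels = [
    ChannelResult("affiliate sites", click_throughs=2000, redemptions=120),
    ChannelResult("banner ads", click_throughs=5000, redemptions=150),
]

by_clicks = max(channels, key=lambda c: c.click_throughs)
by_yield = max(channels, key=lambda c: c.redemption_rate)

print(f"Most click-throughs: {by_clicks.name}")
print(f"Best redemption rate: {by_yield.name} ({by_yield.redemption_rate:.1%})")
```

In this example the banner ads win on click-throughs, but the affiliate sites redeem at twice the rate (6 percent versus 3 percent). If each click-through costs money, the second chart is the one that should drive the allocation.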

Choosing the right data set but the wrong instrument is also a risk. For example, businesspeople often use pie charts in data visualization, yet experts consider pie charts bad form in part because such charts represent static percentages as variable geometric dimensions. Representing a data set in a table is always preferable to using a pie chart, regardless of what the OLAP vendors tell you.

A more complex problem can be the imagery itself. The human mind responds differently to different shapes and colors. For example, you wouldn’t create a graph of arctic temperatures using red shapes because red is a warm color. But in a financial environment, red shapes commonly represent negative numbers.

In his classic book The Visual Display of Quantitative Information (Graphics Press, 1992), Edward Tufte dedicates a chapter to graphical integrity. Tufte argues that graphics should offer some uniformity that gives the perceiver a chance of getting the numbers right. Tufte introduces a formula called the “lie factor” that measures the misrepresentation present in a graphic. For example, if you were to use a pie chart to illustrate a 25 percent increase in Internet sales from one period to the next, you would have to ensure the area of the pie chart and the area of each pie segment are exactly 25 percent larger.
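Tufte defines the lie factor as the size of the effect shown in the graphic divided by the size of the effect in the data, where each effect is the relative change between two values. The sketch below applies that definition to a hypothetical case: sales rise 25 percent, but the designer enlarges a circular chart by scaling its radius by 1.25, so its area grows by more than 25 percent. The function names and the scenario are assumptions for illustration.

```python
def effect_size(first: float, second: float) -> float:
    """Relative change from the first value to the second."""
    return (second - first) / first


def lie_factor(data_first, data_second, graphic_first, graphic_second):
    """Effect shown in the graphic divided by the effect present in the data."""
    return effect_size(graphic_first, graphic_second) / effect_size(
        data_first, data_second
    )


# Data: sales grow from 100 to 125 (a 25 percent increase).
# Graphic: the radius is scaled by 1.25, so the area is scaled by 1.25 squared.
area_before = 1.0
area_after = 1.25 ** 2

factor = lie_factor(100, 125, area_before, area_after)
print(f"Lie factor: {factor:.2f}")  # prints "Lie factor: 2.25"
```

A lie factor of 1 means the graphic changes in proportion to the data. Here the drawn area overstates the increase by a factor of 2.25, which is why the area, not the radius, has to grow by 25 percent.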

Finally, there is what I call the “apples and oranges comparison” risk. This risk is one of statistical significance; it occurs when the display has more than two variables. Think of a time series graph intended to display sales as a function of income and age. The graph has one time-series line representing age and another representing income. It attempts to measure the contribution of income to sales independent of age, and the contribution of age to sales independent of income. The problem is that age also contributes to income, so as age increases, both sales and income increase. Thus, the business user might be led to misunderstand that increased income is driving sales, when in reality, increased age is the culprit.
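A simple check for this kind of confusion is to hold one variable fixed and look again. The sketch below uses six invented customer records in which sales depend on age alone; the `pearson` helper and the data are assumptions for illustration, not a recommended statistical procedure.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented records: (age, income in thousands, sales).
customers = [
    (30, 40, 10), (30, 50, 12), (30, 60, 10),
    (50, 70, 20), (50, 80, 22), (50, 90, 20),
]

# Income versus sales across all customers.
overall = pearson([c[1] for c in customers], [c[2] for c in customers])

# Income versus sales within each age group.
within = {}
for age in sorted({c[0] for c in customers}):
    group = [c for c in customers if c[0] == age]
    within[age] = pearson([c[1] for c in group], [c[2] for c in group])

print(f"All customers: r = {overall:.2f}")
for age, r in within.items():
    print(f"Age {age} only: r = {r:.2f}")
```

Across all six records, income and sales appear strongly related. Within each age group the relationship disappears, which suggests that in this made-up data age is doing the work and income is merely traveling with it.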

Data Visualization for E-Commerce

The electronic environment differs from the physical environment in many ways. The available data is more voluminous and more disparate. So while you would expect a more complete picture, in fact the risks of missing critical detail or developing the picture with poor data are even greater.

In the offline world, your organization collects data at checkout or point of sale. Thus, any subsequent basket analysis — which you use to create direct marketing promotions — reflects product performance rather than customer behavior because many customers use cash as their payment method. In the online world, because each and every mouse click exposes customer behavior, you can extend basket analysis to include browser, or clickstream, analysis. Furthermore, you may have information about your customers before they arrive at the store — perhaps they registered with a portal whose members belong to a particular age demographic. Almost everything the customer experiences is available, such as the products that interest them, the time they spend on the Web site, where they come from, and where they go when they leave. You can even measure the interval between the “browse click” and the “buy click.”
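As a rough illustration of that last measurement, the sketch below computes the interval between a visitor's first browse click on a product and the first buy click for the same product. The log layout, visitor IDs, product codes, and timestamps are all invented; a real Web server log would need parsing into this shape first.

```python
from datetime import datetime

# Hypothetical clickstream rows: (visitor_id, ISO timestamp, event, product_id).
clicks = [
    ("v1", "2000-03-01T10:00:00", "browse", "cruise-7day"),
    ("v1", "2000-03-01T10:04:30", "buy", "cruise-7day"),
    ("v2", "2000-03-01T11:00:00", "browse", "cruise-7day"),
    ("v3", "2000-03-01T12:00:00", "browse", "cruise-3day"),
    ("v3", "2000-03-01T12:10:00", "browse", "cruise-3day"),
    ("v3", "2000-03-01T12:25:00", "buy", "cruise-3day"),
]


def browse_to_buy_seconds(rows):
    """Seconds from first browse click to first buy click, per visitor and product."""
    first_browse = {}
    intervals = {}
    # ISO timestamps in the same format sort chronologically as strings.
    for visitor, stamp, event, product in sorted(rows, key=lambda r: r[1]):
        key = (visitor, product)
        when = datetime.fromisoformat(stamp)
        if event == "browse":
            first_browse.setdefault(key, when)  # keep only the earliest browse
        elif event == "buy" and key in first_browse and key not in intervals:
            intervals[key] = (when - first_browse[key]).total_seconds()
    return intervals


intervals = browse_to_buy_seconds(clicks)
for (visitor, product), seconds in intervals.items():
    print(f"{visitor} bought {product} after {seconds / 60:.1f} minutes")
```

Visitor v2 browses but never buys, so no interval is recorded; those browse-only visitors are the ones the buyer-versus-browser comparison is meant to explain.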

Access to this information gives you more opportunities to develop one-to-one marketing programs. The role of visualization here is to discover how the patterns in consumer behavior will translate into a single view of each customer. Consequently, visualization tools should help you drill down and through the data to achieve a granular view of each customer segment. For example, let’s say the sales results for Cellphones.com’s e-promotion show 60 percent males and 40 percent females. Drilling through the female segment may reveal that 50 percent are full-time homemakers with 2.5 children. By visualizing these results in real time or near real time, you can create a personalized “friends and family” cross-sell promotion targeted exclusively at homemakers when the next wave of visitors checks in.

However, the more data there is, the greater the risk of displaying good data poorly. The log files contain every click that a customer makes in your online store — so how do you decide what information to display? What variable should be the dependent variable, and which variable should be the independent variable? For example, if your business question is How can we grow e-commerce sales?, the answer is probably in the products that you aren’t selling. You can look at how frequently customers browse those products but don’t buy them, but be sure to chart the right products. If you operate a bookstore, you don’t want to put mystery novels, cookbooks, and tax planners on the same chart — the cookbook customer segment probably differs from the tax planner segment, which is likely to differ from the mystery novel segment. Creating a chart including all three products would suggest a comparison where none is likely to exist.

Visualizing the Electronic Environment

To summarize, your organization must know what data is available and how to collect and store it. You must provide tools that display the data so that the visual and numeric representations are consistent. You must know the questions that need to be answered so that the tools will reveal the truth quickly and accurately. You must understand your business and IT environment so you can choose the visual representations consistent with that environment. Finally, you must understand the relationships among displayed variables so that you can use independent variables to predict dependent variables, and not other independent variables.

Finally, visualization must reflect the fact that your environment is dynamic; companies now have to analyze information that grows with each mouse click. The customers, merchants, and products haven’t changed much, but the focus has moved from product performance to customer behavior. Stores are not in malls, they’re in URLs; customers don’t drive minivans, they arrive with an IP address. Customers don’t compare products on shelves and in aisles; they compare them among Web sites worldwide. The speed of the Internet has reduced the reporting time to digest business results and the time to effectively respond to customer information. All these changes present challenges and opportunities that will raise the stakes for data visualization.

WITHIN PLAIN SIGHT

Choosing the right instrument for displaying data is critical to maximizing the value of the information. First, you need to choose between a desktop- and a Web-based tool. Desktop tools have the advantage in performance; they can leverage the processing power and graphics capability of the local architecture. Plus, they don’t have the burden of pulling the graphics through the Internet pipe. The disadvantage of the desktop application is that the data sits on the server, and the information needs to travel through the pipe and parse into a form that the application can read. Conversely, Web-based tools have the advantage of drawing all their resources (rendering, processing, and so on) from the server; thus, the graphics can update dynamically inside the user’s browser. The disadvantage, however, is that different views — such as 3D rendering or visual drill downs — are not easy to create there.

Many desktop visualization applications are also Web-enabled. For example, MapInfo Corp.’s MapXtreme is a mapping server for distributing spatial (geographic) information to multiple users via the Web. Users can develop custom applications that incorporate data, graphics, and spatial dimensions, which is important when your customers are all over the country and all you have is an address or ZIP code. MapXtreme will let you create color-coded maps, which you can combine with Web, marketing, and census data to create profiles of your customers and e-customers. (See Figure 1.)

FIGURE 1 MapInfo's MapXtreme.

Visual Mining Inc.’s DecisionControl, from the pure-play Web-based camp, is another tool that may make sense for your e-commerce needs. DecisionControl is an enterprise portal solution that lets users create a highly personalized, browser-based gateway to the enterprise. Returning to our earlier airplane analogy, it lets you customize an instrument panel that you can cross-check against the view outside; through dynamic updates, your enterprise can make real-time adjustments to stay on course. (See Figure 2.)

FIGURE 2 Visual Mining’s DecisionControl.

Let’s say you have a banner ad running on a portal site frequented by females and another on a portal frequented by males. Visual results show a higher rate of click-through activity from the male site but a higher rate of redemptions originating from the female site. What I see is a low return on my banner program unless I take action to reduce the banners on the male site and increase banner impressions on the female site. So as you can see, whether you’re flying an airplane or analyzing Web data, data that’s 15 minutes old is no help.

Visual Insights Inc., already the big kid on the block, recently became bigger with the acquisition of Visible Decisions Inc. Visual Insights markets Advizor/2000, a Microsoft Excel enhancement application designed to simplify the display and analysis of complex multidimensional relationships. The application uses familiar visual metaphors, or perspectives, designed to maximize the users’ ability to discover insights in their data. Advizor combines interactive data visualization components with a visual workspace for assembling customer behavior and other vertical applications. (See Figure 3.)

FIGURE 3 Visual Insights’ Advizor.


One of the advantages of Advizor is its drill-down capability; you can use the mouse to drag and select any portion of a bar chart to get a granular view — visual or tabular — of the selected area. This facility is especially helpful in finding the “sweet spot” in the data or to find data anomalies to filter out of the analysis.





Don Nachtwey (dnachtwey@e-centives.com) is a product manager for e-centives Inc.



RESOURCES

MapInfo: www.mapinfo.com
Visual Insights: www.visualinsight.com
Visual Mining: www.visualmining.com