Better StorytellingDimensional design techniques bind events into storiesby Jim Stagnitto
Storytelling is a powerful means of sharing knowledge because it simultaneously and elegantly satisfies several innate human needs, including the need for entertainment, sharing the accumulated wisdom of experience, and providing avenues of self-expression. How, you may ask, does storytelling relate to the job of a data warehouse designer? Design techniques can amplify the dimensional data warehouse's ability to tell the stories encased within the millions or billions of events sitting within its subject areas. The designer's challenge is to select dimensional approaches that allow smart, albeit nontechnical, users to look across millions of overlapping customer stories that occur asynchronously to uncover their commonalities and patterns while maintaining the integrity of each individual event. We should aspire to provide this capability using only simple, dimensionally friendly query tools, rather than relying on Ph.D.-level SQL or data-mining techniques. In practice, this goal means adding new dimensional support to the detailed data, allowing users to creatively nudge these out-of-phase events into novel alignments. To illustrate these design techniques, I'll use the example of a Web retailer. The retailer will have a clickstream page-visit subject area in which each fact represents a visitor viewing a page. The techniques I present, however, should be applicable to any subject area in which one event is a part of a larger and interesting sequence of events, or story, such as multi-leg voyages and shipments, airline flights, telephone calls, or network traffic routing. More profoundly, these techniques can also be applied to second-tier subject areas that consolidate multiple customer, prospect, or partner touchpoints, allowing analysts to scrutinize the exchange patterns that lead to changes in customer behavior over time. Trumping TimeCalendar date is a dimension used in almost every report query, but viewing events solely in chronological order can veil predictable sequences of events at lower levels. Here are some dimensional techniques for dragging these microlevel sequences into the light of day, effectively realigning events on the fly, based on their common story milestones. The step dimension, shown in Figure 1, is a spin on Ralph Kimball's bridge table concept (see Ralph's "Help for Hierarchies," in the DBMS archives, Sept. 1998).
Table 1 shows sample rows for a step dimension.The step dimension describes each event's position within a story sequence by explicitly proclaiming: This page visit is associated with session step key 10, which means that it's the fourth page visit of a session consisting of four total page visits and zero page visits from the end.
Users can constrain on and group by the step dimension and quickly identify the page visits belonging to sessions of any length, pages that are session killers (zero steps from the end), or the pages that precede these session killers. This method allows analysts to easily realign page visits by their position within their respective sessions to answer business questions about the beginnings and endings of sessions for any subset of interesting customers, calendar time, and promotions. Including an additional dimension for purchase-path step role-playing allows analysts to easily scrutinize the sequence of page visits that lead to purchases or the placement of something in a shopping cart. Because more than one purchase can occur in a session, the purchase-path step dimension should roll over (start back at step one) for page visits in the session that follow a shopping cart add. Page visits that don't lead to a shopping cart add should point to a "Not Applicable" purchase-path step dimension row. For example, Table 2 shows a session consisting of seven total page visits, with purchases in the third and fifth page visits.
|
Most Popular This Week
IE Weekly Newsletter
Subscribe to the newsletter
|
|
|















