September 14, 1999, Volume 2 - Number 13
How the 10 Web-site design guidelines apply to the data webhouse
Designing the User Interface
Ralph Kimball
In my last column ("The Second Revolution of User Interfaces," Aug. 24), I identified 10 guidelines for user interface design that meet the Web's unique demands. I also promised to apply those guidelines directly to the needs of the data webhouse (as I call the Web-enabled data warehouse). Here are the 10 guidelines:
Near-Instantaneous Performance. Achieving near-instantaneous performance requires a systemwide attack on all the parts of the data webhouse pipeline that can slow performance. You can also do some useful cosmetic procedures that improve perceived performance.
The most effective data-webhouse performance enhancers, starting with the most important:
Choose DBMS software that is purpose-built for query performance. Increasingly, this guideline means choosing a dimensionally-aware DBMS that thrives on join-rich queries either on online analytical processing (OLAP) cubes or star-join schemas. Of course, you must build simple dimensional schemas to begin with.
Use your DBMS indexes effectively. Choose a DBMS that lets you index every access path in a multidimensional schema. Some vendors offer separate bitmap indexes on each of a fact table's foreign keys. With eight dimensions, you have eight indexes, but you can combine them to handle any combination of user constraints (the so-called ad hoc attack).
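To make the idea of combining separate bitmap indexes concrete, here is a minimal sketch in Python. It is not how any particular DBMS implements the technique; the fact rows, the column names, and the functions `build_bitmap_index` and `matching_rows` are all hypothetical, and a Python integer stands in for each bitmap.

```python
# Hypothetical fact table: each row holds foreign keys into three dimensions.
FACT_ROWS = [
    {"product": "widget", "store": "north", "month": "1999-07"},
    {"product": "gadget", "store": "north", "month": "1999-07"},
    {"product": "widget", "store": "south", "month": "1999-08"},
    {"product": "widget", "store": "north", "month": "1999-08"},
]


def build_bitmap_index(rows, column):
    """Map each key value to an integer whose bit i is set when row i holds that value."""
    index = {}
    for position, row in enumerate(rows):
        index[row[column]] = index.get(row[column], 0) | (1 << position)
    return index


# One separate index per foreign key, as the text describes.
INDEXES = {
    column: build_bitmap_index(FACT_ROWS, column)
    for column in ("product", "store", "month")
}


def matching_rows(constraints):
    """AND together the bitmaps for whichever columns the user happened to constrain."""
    bits = (1 << len(FACT_ROWS)) - 1  # start with every row selected
    for column, value in constraints.items():
        bits &= INDEXES[column].get(value, 0)
    return [i for i in range(len(FACT_ROWS)) if (bits >> i) & 1]
```

Calling `matching_rows({"product": "widget", "store": "north"})` intersects two of the three indexes and leaves the third untouched, which is why the same small set of indexes can serve any ad hoc combination of constraints.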
Use aggregations effectively. Aggregations behave like indexes; they consume disk space and require back-room administration. You should have an aggregate navigator invoke them transparently. But, perhaps even more than indexes, an appropriately used aggregation can harvest huge performance gains.
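As a rough illustration of what an aggregate navigator decides, the sketch below picks the smallest table that still carries every attribute a query asks for. The catalog, the table names, and the row counts are invented for the example, and a real navigator would also rewrite the SQL, which this sketch does not attempt.

```python
# Hypothetical catalog: table name -> (dimension attributes it retains, row count).
AGGREGATES = {
    "sales_fact": (
        {"day", "month", "product", "category", "store", "region"},
        10_000_000,
    ),
    "sales_by_month_product": ({"month", "product", "category"}, 200_000),
    "sales_by_month_category": ({"month", "category"}, 1_200),
}


def choose_table(required_attributes):
    """Return the smallest table that retains every attribute the query needs."""
    needed = set(required_attributes)
    candidates = [
        (row_count, name)
        for name, (attributes, row_count) in AGGREGATES.items()
        if needed <= attributes  # the table must cover all requested attributes
    ]
    if not candidates:
        raise LookupError(f"no table can answer a query on {sorted(needed)}")
    return min(candidates)[1]
```

The user or the reporting tool always asks the question against the base schema; the choice of `sales_by_month_category` over `sales_fact` happens behind the scenes, which is what keeps the aggregates transparent.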
Increase real memory. Real RAM almost always improves performance, because it increases the in-memory working set size and reduces the need to swap data to and from the disk. RAM access is roughly 100 times faster than normal disk access. A big data webhouse engine can often use gigabytes of RAM.
Exploit parallel processing, which can directly speed up many activities in the data webhouse. Once you choose parallel-enabled hardware and software, pursue parallelism at the applications level also. Multi-pass SQL is a technique for decomposing reports and complex comparisons into several separate queries, each of which is simple and fast. In almost all applications of multi-pass SQL, the separate queries are executable in parallel, on the same machine or different ones.
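The following sketch shows one way multi-pass SQL might be run in parallel at the application level. It assumes a year-over-year comparison as the report, uses SQLite and a thread pool purely for convenience, and works on a made-up `sales` table; the function names are hypothetical, and a production webhouse would send the passes to its own DBMS.

```python
import os
import sqlite3
import tempfile
from concurrent.futures import ThreadPoolExecutor


def create_demo_database(path):
    """Build a tiny, made-up sales table so the example is self-contained."""
    conn = sqlite3.connect(path)
    try:
        conn.execute("CREATE TABLE sales (year INTEGER, product TEXT, amount REAL)")
        conn.executemany(
            "INSERT INTO sales VALUES (?, ?, ?)",
            [(1998, "widget", 100.0), (1998, "gadget", 50.0),
             (1999, "widget", 120.0), (1999, "gadget", 45.0),
             (1999, "widget", 30.0)],
        )
        conn.commit()
    finally:
        conn.close()


def run_pass(path, year):
    """One simple pass: total sales by product for a single year."""
    conn = sqlite3.connect(path)  # each thread opens its own connection
    try:
        rows = conn.execute(
            "SELECT product, SUM(amount) FROM sales WHERE year = ? GROUP BY product",
            (year,),
        ).fetchall()
    finally:
        conn.close()
    return dict(rows)


def year_over_year(path, this_year, last_year):
    """Run both passes concurrently, then merge the answer sets in the application."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        current, previous = pool.map(lambda y: run_pass(path, y), [this_year, last_year])
    products = sorted(set(current) | set(previous))
    return {p: (previous.get(p, 0.0), current.get(p, 0.0)) for p in products}


def demo():
    """Create a throwaway database and compare 1999 with 1998."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "demo.db")
        create_demo_database(path)
        return year_over_year(path, this_year=1999, last_year=1998)
```

Each pass is a plain single-table query with one constraint, so neither depends on the other, and the comparison itself is assembled outside the database once both answer sets arrive.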
Use progressive disclosure. Design your webhouse interfaces to paint useful content, especially navigation buttons, immediately. Remember: in the data webhouse, everything is a Web page. The user can begin reading the text to understand the content of the page before other items, such as graphic images, finish painting. Large graphic images should be painted progressively at increasing resolutions so that users can recognize them before they finish painting. Information appearing off-screen should paint after all the main useful information on the first screen appears.
Use caching at all levels. Three kinds are useful to the webhouse. Web-page caching works for static pages, the content of which is known in advance. Its objective is to allow a local server on your high-speed network to give you the page rather than reaching across the Web to the original host. Data caching is distinct from page caching; you can think of it as a kind of precomputed query stored for rapid retrieval. Data caching includes the use of aggregations. Report caching is a larger form of data caching and involves more. Full report generation may involve merging data from multiple sources, or perhaps running complex analytic models. If the report can be anticipated in advance, and especially if more than one user will access it, fetching it from a cache always, obviously, improves performance.
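A report cache of the kind described above can be sketched in a few lines. The class below is an illustration under stated assumptions, not a description of any product: `ReportCache`, the `build_report` callback, and the `max_age_seconds` expiry rule are all hypothetical, and the expiry in particular is an added assumption about how stale a cached report is allowed to become.

```python
import time


class ReportCache:
    """Keep finished reports keyed by name and parameters; rebuild them when stale."""

    def __init__(self, build_report, max_age_seconds=3600.0, clock=time.monotonic):
        self._build = build_report      # the slow function that really runs the report
        self._max_age = max_age_seconds
        self._clock = clock
        self._entries = {}              # key -> (time stored, finished report)
        self.builds = 0                 # how many times the slow path actually ran

    def get(self, name, **params):
        key = (name, tuple(sorted(params.items())))
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] <= self._max_age:
            return entry[1]             # cache hit: no query, no merge, no model run
        report = self._build(name, **params)
        self.builds += 1
        self._entries[key] = (now, report)
        return report
```

Because the key includes the parameters, two users asking for the same report with the same constraints share one cached result, while a different region or period triggers its own build.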
Expected Choices. Every Web user interface provides all the natural choices the user expects and makes them immediately visible and recognizable. The designer needs to carefully list all the choices a user might expect when opening a particular page. The categories include sets of predictable navigation, application-specific, help, and communication choices.
Navigation choices. The conventions of the Web itself, not the individual Web site, determine user interface design. The user's perception of the Web is that it is a seamless whole, not a collection of independent media. Basic site navigation must be possible from every Web page in a standard and predictable way. Site navigation buttons include drill-down choices from the current page, a direct link to the home page, major site subject choices, the site map button, the site search box, alternate language versions of the site, and a trouble-resolution button.
Application-specific choices. Users expect to navigate available reports to choose the one they want. An explanation of the report's meaning should be available to the user reviewing the choices. It's a good idea to provide an instantly available, precomputed sample of the report, especially if the report itself takes a long time to return to the user's screen. If the report does take a long time, it should be emailed to the user, so the user can close the browser window or log off the computer.
Help choices. Besides links to tool documentation and frequently asked questions, every webhouse environment needs a metadata interface that allows the user to understand the organization's data assets. The metadata interface should display the names and definitions of all the available data elements in the webhouse. The definitions should be organized into brief introductions, detailed technical derivations, and current extract status reports.
Communication choices. Webhouse user interfaces need links at the bottom of every window to data webhouse technical support, sponsoring department business support, and higher-level management both in IT and the sponsoring department. However, these communication interfaces must be supported with very responsive follow-up. If the user sends an email to one of these functions, an automatic response should arrive within minutes, together with a promise that a real human being will follow up within a specified time frame. Then that follow-up needs to occur as promised.
No Gratuitous Distractions. This guideline perhaps should be phrased "make every page view a pleasant experience." There are many techniques you can use to improve the page viewing experience, and a lot of them are issues of good taste. The ones I think are most important include:
Use fonts and colors only to communicate effectively. Typographers learned hundreds of years ago that less is more when it comes to laying out a book, magazine, or newspaper. When the font draws attention to itself, it has failed in its mission to convey the content effortlessly and pleasantly to the reader. Similarly, avoid gimmicks such as blinking objects on the screen and the excessive use of exclamation points.
Simplify your reporting interfaces. The message consistently resounding from the Web is that simplicity wins. The best simple interfaces get exponentially more use. Simple means uncluttered and direct. In some sense the best interface is a nearly blank screen with only two or three buttons in the middle that say, for example, "Push me for report #1" and "Push me for report #2."
Provide convenient capture. The webhouse reporting tools need to provide convenient capture of the results on the screen for use in all the other tools. Selected rows and columns of reports should be selectable and copyable to spreadsheets and documents.
Streamlined Processes. As the webhouse architect, you must design your business processes from the ground up to work seamlessly on the Web. For example:
Work with legacy system designers to architect a seamless application suite with uniform Web interfaces.
Remove barriers to accessing a page. An important page on your Web site should be easy to reach.
Count the clicks and count the windows to judge whether the process is streamlined.
Let users resume sessions and park reports to work on later.
Build an explicit value chain for reporting and analysis around the application suite, using conformed dimensions and facts.
Provide easy drill-across reporting.
Provide complete report library descriptions and frequently asked questions (FAQs).
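Drill-across reporting depends on the conformed dimensions mentioned in the list above: because two fact tables label their rows with the same dimension attribute, their answer sets can be lined up side by side. The sketch below assumes each answer set has already been fetched as a dictionary keyed by that shared row header; the function name `drill_across` and the sample figures are hypothetical.

```python
def drill_across(answer_sets):
    """Merge answer sets on their shared row header, keeping rows that appear in any of them."""
    merged = {}
    for fact_name, rows in answer_sets.items():
        for row_header, value in rows.items():
            merged.setdefault(row_header, {})[fact_name] = value
    columns = list(answer_sets)
    # A missing value stays None rather than 0, so "no data" is not confused with "zero".
    return [
        (header, *[merged[header].get(column) for column in columns])
        for header in sorted(merged)
    ]


# Made-up answer sets from two separate fact tables, both keyed by product name.
REPORT = drill_across({
    "shipments": {"widget": 120, "gadget": 40},
    "returns": {"widget": 5, "sprocket": 2},
})
```

The merge only works because both answer sets spell the product names identically; if the two sources used different product keys or labels, there would be nothing reliable to join on.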
Reassurance. Users are more comfortable with a Web site when they can visualize where they are in the process and see that everything is all right. In a linear process that is hard to visualize, the status of the process should be carried along from page to page:
Provide a map of the processes.
Provide status and lineage of data.
Provide status of running reports.
Actively notify users when new data is available or when reports are complete.
Timestamp your dimensions and your reports.
Trust. General Web site trust comes from respecting the user's privacy and communicating clearly to the user the intended uses of any personal information. Data webhouse trust includes these elements but also implies that the information it presents to the webhouse's business user is secure.
Implement two-factor security everywhere. Two-factor security involves verifying what you know (a password) and what you possess (a piece of plastic or maybe your thumb).
Track human resource changes for employees and contractors. Ideally, the data webhouse manager works closely and continuously with the human resource manager to make sure these changes of status are reflected in the information access privileges of the webhouse.
Manage information boundaries among employees, contractors, and customers. Much of the data in the webhouse must be carefully partitioned so that only the right people have access to it. A breakdown in the authorization system for accessing data will scare business partners and can easily provoke lawsuits.
Manage webhouse security directly. No one is better qualified to marry the data with the users from a security perspective than the data webhouse manager. The security responsibility for the webhouse must not be ignored, and it must not be handed to a service organization that has no way of understanding the content of the webhouse data or the legitimate access needs of all the users. A large webhouse environment requires a full-time security manager.
Problem Resolution. The data webhouse needs the same attention to problem resolution as the general Web site.
Allow backtracking and play forward. Backtracking means returning to a previous step in the process, maybe because the user realized that different information was required. Then the process is resumed from that point forward.
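One way to support backtracking and play forward is to remember the input the user gave at each step, so that a revised step can be rerun and the later steps replayed automatically. The sketch below assumes a process made of simple functions that each take the current state and one user input; the class `SteppedProcess` and the three sample steps are hypothetical.

```python
class SteppedProcess:
    """Remember each step's input so a user can revise one step and replay the rest."""

    def __init__(self, steps):
        self._steps = list(steps)  # functions: (state, user_input) -> new state
        self._inputs = []          # inputs supplied so far, one per completed step
        self._states = [{}]        # _states[i] is the state before step i runs

    def advance(self, user_input):
        """Run the next step with the given input and record both input and result."""
        step = self._steps[len(self._inputs)]
        self._states.append(step(self._states[-1], user_input))
        self._inputs.append(user_input)
        return self._states[-1]

    def backtrack(self, step_index, new_input):
        """Return to step_index, change its input, then play forward with the later inputs."""
        later_inputs = self._inputs[step_index + 1:]
        del self._inputs[step_index:]
        del self._states[step_index + 1:]
        self.advance(new_input)
        for user_input in later_inputs:
            self.advance(user_input)
        return self._states[-1]


def choose_region(state, region):
    return {**state, "region": region}


def choose_year(state, year):
    return {**state, "year": year}


def build_title(state, _unused):
    return {**state, "title": f"{state['region']} sales, {state['year']}"}
```

A user who picked Sweden, then 1999, and then realized Spain was wanted can call `backtrack(0, "Spain")`; the year choice and the title step are replayed without being entered again.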
Make it easy for users to report errors. Users can be very helpful in reporting quality problems. But it must be a positive and easy action for them. A corollary is to provide a user survey capability that lets users describe how well their needs have been served.
Religiously acknowledge, track, and follow up all user inputs. User inputs will dry up quickly if significant energy is not put into acknowledging the inputs, tracking the status of the inputs, effectively addressing the inputs, and always communicating these steps to the original user who provided the input.
Provide adequate end-user support. The use of a data webhouse, and the use of data in general, requires a significant commitment to direct support. At the beginning of a data webhouse rollout, it is reasonable to plan for one MBA-class support person for every 20 end users, maybe gravitating to one for every 50 end users eventually.
Communication Hooks. Part of the expected choices I discussed earlier was a set of communication links to the key individuals and supporting functions behind the Web site. These links included technical support for help with the Web site or database access, business support for help with report content and the location of data, and management for general questions and information.
International Transparency. In previous articles, I have written about international calendars, time zones, currencies, and address formats. In an intensive reporting environment, the webhouse architect also needs to be concerned with multinational reporting requirements. In multinational organizations there is a complex trade-off between expressing business results in local terms and expressing them in a single central language or style. The key differences revolve around the choice of dates, times, currencies, languages, and collating sequences. You want the sales manager from Sweden and the sales manager from Spain to have similar-looking reports that they can compare.
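One possible way to ease the trade-off between local and central presentation is to carry both in every report row. The sketch below assumes a single corporate standard currency and shows the local amount next to its converted equivalent; the conversion rates are made up for illustration, the choice of USD as the standard is an assumption, and `report_row` is a hypothetical helper.

```python
from decimal import Decimal

# Made-up conversion rates into a hypothetical corporate standard currency (USD).
RATES_TO_STANDARD = {"SEK": Decimal("0.12"), "ESP": Decimal("0.0065")}


def report_row(office, currency, local_amount):
    """Carry the local amount and its standard-currency equivalent in the same row."""
    local = Decimal(str(local_amount))
    standard = (local * RATES_TO_STANDARD[currency]).quantize(Decimal("0.01"))
    return {
        "office": office,
        "local": f"{local:,.2f} {currency}",
        "standard": f"{standard:,.2f} USD",
    }


ROWS = [
    report_row("Stockholm", "SEK", 250000),
    report_row("Madrid", "ESP", 4000000),
]
```

Each manager can read the familiar local figure, while the standard column gives the two reports a common basis for comparison. The number formatting here uses one fixed style; a fuller treatment would also localize separators, dates, and collation.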
Common Denominator Compatibility. The general Web site must try to accommodate multiple browser types, old browser versions, slow phone lines, and tiny displays. Perhaps you have a little more freedom when designing the user interfaces for employees and business partners. It seems reasonable to ask that all participants use the same browser software and that all of this software be at a common revision level. You can also deploy specific ActiveX or JavaBean applets to every user to assist in the analysis or the presentation.
The Web makes user interface design more urgent and more important. Increasingly, as we expose our data warehouses through the Web, and make them data webhouses, we are subject to the same user interface pressures. Ultimately, we webhouse designers must attack all 10 of the user interface guidelines. Doing so is certainly a significant effort. But as with many aspects of the Web, we are being swept along. Welcome to the Web.
Ralph Kimball, Ph.D., co-inventor of the Star Workstation at Xerox and founder of Red Brick Systems, works as an independent consultant designing large data warehouses. He is the author of The Data Warehouse Toolkit (Wiley, 1996) and the newly published The Data Warehouse Lifecycle Toolkit (Wiley, 1998). You can reach him through his Web page at rkimball.com.