The Soul of the Data Warehouse, Part One: Drilling DownDrilling down just means "show me more detail"Although data warehouses come in many shapes and sizes and deal with many different subject areas, every data warehouse must embody a few fundamental themes. The three most important are drilling down, drilling across, and handling time. Modern data warehouses so deeply embed these three themes that I think an "if-and-only-if" relationship has developed between them and a real data warehouse. If a system supports drilling down, drilling across, and handling time, then as long as it's easy to use and runs fast, it automatically qualifies as a data warehouse. But as simple as these three themes might seem, they give rise to a set of detailed and powerful architectural guidelines that should not be compromised. Drilling down, drilling across, and handling time are so important that I'll devote a separate column in this Fundamentals series to each. In this column, I drill down into drilling down, starting with a precise operational definition. Then, as a good engineer should, I lay out practical guidelines for building systems that do a good job of drilling down. Drilling DownDrilling down in a relational database means "adding a row header" to an existing
Incidentally, we often call a row header a "grouping column" because everything in the select
list that's not aggregated with an operator such as This example is particularly simple because it's very likely that, in a dimensional star
schema, both the manufacturer attribute and the brand attribute exist in the same product
dimension table. So, after running the first query at the manufacturer level, you look at the
list of attributes in the product dimension and opportunistically drag the brand attribute into
the query. Then you run it again, thereby drilling down in an ad hoc way. If the brand attribute
is indeed in the same dimension table as the manufacturer attribute, then the only adjustments to
the original SQL are to add You could just as well have selected the color attribute for drilling down rather than the
brand attribute. In fact, if you substitute the word The idea that you can expand any report row to show more detail simply by adding a new row header is one of the powerful ideas that form the soul of a data warehouse. A good data warehouse designer should always be thinking of additional drill-down paths to add to an existing environment. An example of this out-of-the-box thinking is to add an audit dimension to a fact table. (See my column "Indicators of Quality," April 10, 2000) The audit dimension contains indicators of data quality in the fact table, such as "data element out of bounds." But this audit dimension can be part of the drill-down process! Now you can devise a standard report to drill down to issues of data quality, including the proportion of questionable data. By drilling down on data quality, each row of the original report would appear as multiple rows, each with a different data quality indicator. Most of the report results should cluster under the "normal" row headers. Finally, it's even possible to drill down with a calculation, as long as you're careful not to
use an aggregate operator such as Building a System That WorksI can now make some precise technical comments about drilling down: 1. Drilling down is the most basic end-user maneuver in the data warehouse, which must support it in as general and flexible a manner as possible because there's no way to predict the user's drill-down path. In other words, every drill-down path must be available and supported with the same user interface gestures because the users see little conceptual difference between the various forms of drilling down described in the preceding examples.
|
Most Popular This Week
IE Weekly Newsletter
Subscribe to the newsletter
|
|
|











