Guide to the TechWeb Network

Intelligent Enterprise

Better Insight for Business Decisions

Intelligent Enterprise - Better Insight for Business Decisions
search Intelligent Enterprise
Advanced Search
RSS
Webcasts
Whitepapers
Subscribe
Home




September 17, 2002

Two Powerful Ideas

The foundations for modern data warehousing

by Ralph Kimball

Continued from Page 1

Another example of a high-end analytic tool the data warehouse should avoid is the allocation system for assigning costs to the various lines of business in your organization so as to compute overall profitability. Not only can this be a complex processing step outside the capabilities of most query and reporting tools, it's also a political hot potato. Let the finance department do the allocations, and you (the data warehouse) will be glad to store the results.

Symmetrical Stars and Cubes

Most presentation areas today are dominated by relational star schemas and multi-dimensional OLAP data cubes. These data structures have proven over the last 30 years to be the ones end users can understand. Remember that understandability is one of our two nonnegotiable design constraints.

The simplicity of the star schemas and OLAP cubes has allowed smart software designers to focus on very powerful algorithms for rapid query performance. Remember that speed is the other nonnegotiable design constraint.

The symmetry of both the star schema and OLAP cube also makes for:

  • Predictable user interfaces that can "learn" what to do at query time
  • Predictable administrative scenarios in the staging area because all the data structures have the same familiar look and feel
  • Predictable implementation responses whenever new types of data are made available.

Of course, the star schema and OLAP cube are intimately related. Star schemas are most appropriate for very large data sets, with many millions or billions of numerical measurements, or many millions of members in a customer entity or a product entity. OLAP cubes are most appropriate for smaller data sets where analytic tools can perform complex data comparisons and calculations. In almost all OLAP cube environments, it's recommended that you originally source data into a star schema structure, and then use wizards to transform the data into the OLAP cube. In that way, all the complex staging area ETL tools that deal with flat files and entity/relationship schemas can be part of the OLAP data pipeline. And, of course, hybrid star schema-OLAP systems allow very large data sets in star schema format to be smooth drill-down targets from smaller OLAP data cubes, all under a single user interface.

The Big Payoff

The final big payoff for building the presentation system in the data warehouse around symmetrical star schemas and OLAP cubes is the predictable set of points of commonality for linking together data from across the enterprise.

In my next column, I'll lay bare the techniques for conforming the definitions of the dimensions and the facts across all the disparate data sources of your large enterprise. These conformed dimensions and facts will be the basis for a data warehouse bus architecture — a set of standard connection points that provide power to your data warehouse, just like the bus bar in a power station provides power to all the transmission lines, and just like the bus in your computer provides data to all the peripherals.



Rate This Article

Comments:

Optional e-mail address:

What Have We Accomplished?

So far we've implemented two powerful ideas. First we've logically, physically, and administratively separated the systems in our environment into four distinct types. You really do need four different computer systems, but you're responsible for only two of them! Our two data warehouse systems also allow us to separate the incompatible responsibilities of data staging and end-user querying. Second, we've populated our data warehouse presentation area with star schemas and OLAP cubes, the only structures that are understandable, and fast, and can stand up to the ad hoc attack.

Although we clearly haven't addressed all the complex design constraints and unavoidable realities, we've chipped away at an impressive part of the overall challenge, just by leveraging these two powerful ideas. Go back to my previous column and look through the list. We've effectively addressed large parts of understandability, query speed, all three types of costs mentioned in that column, the risks of inappropriate centralization, the need for incremental development, handling continuous change consisting of little surprises and big surprises, and how to think about the role of data marts. Maybe there's hope. To be continued....


Ralph Kimball [www.ralphkimball.com] co-invented the Star Workstation at Xerox and founded Red Brick Systems. He has three best-selling data warehousing books in print, including The Data Warehouse Toolkit, Second Edition (Wiley, 2002). He teaches dimensional data warehouse design through Kimball University and critically reviews large data warehouse projects.


RESOURCES

Related Article at IntelligentEnterprise.com:

"Design Constraints and Unavoidable Realities," Sept. 3, 2002








IE Weekly Newsletter
Subscribe to the newsletter
    Email Address







InformationWeek Business Technology Network
InformationWeekInformationWeek 500InformationWeek 500 ConferenceInformationWeek AnalyticsInformationWeek CIO
InformationWeek EventsInformationWeek ReportsInformationWeek MagazinebMightyByte and SwitchDark Reading
Digital LibraryIntelligent EnterpriseInternet EvolutionNetwork ComputingNo Jitter
space
Techweb Events Network
InteropVoiceConWeb 2.0 ExpoWeb 2.0 SummitEnterprise 2.0 ConferenceMobile Business ExpoSoftware ConferenceCSI - Computer Security Institute
Black HatGTECEnergy CampMashup CampStartup Camp
space
Light Reading Communications Network
Light ReadingLight Reading EuropeUnstrungLight Reading's Cable Digital NewsConstantinopleInternet Evolution
Heavy ReadingLight Reading Live!Light Reading InsiderEthernet ExpoOptical ExpoTeleco TVTower Technology Summit
space
Financial Technology Network
Advanced TradingBank Systems & TechnologyInsurance & TechnologyWall Street & TechnologyAccelerating Wall StreetBank Systems & Technology Executive SummitBuyside Trading SummitInsurance & Technology Executive Summit
space
Microsoft Technology Network
MSDN MagazineTechNetThe Architecture Journal
space