Real-Life Data Mart Processing

Is your mart the symmetric "information diamond" at the end of the data pipeline?

By Gabriel Tanase

Continued from Page 1

I've seen three kinds of nonlinear aggregations in the data marts I've built, all of them in service industries such as insurance. I call these nonlinear aggregation types MSI (for "multistep iterative"), PTS (parameterized time series), and MTS (multiple time series).

An example of an MSI nonlinear aggregation is the calculation of a series of estimated future repayment or residual values for a loan over a number of years. The calculation is iterative because the formula gives the value for year n+1 starting from the value calculated for year n in the previous step.

An example of a PTS nonlinear aggregation is the calculation of the "booked" (or "earned") value of a paid-in-advance service, such as an insurance policy premium, during the service lifetime. The booked value at a given target moment is calculated as a percentage of the advance payment. The percentage applied depends nonlinearly on the time span between the service inception date and the target time for the calculation, which can be "now" or a time in the future. When such a percentage is not computed on the fly but taken from a table, the time granularity of the parameter table constrains the time granularity of the measure calculated from it.

An example of an MTS nonlinear aggregation is the calculation of the booked value of a paid-in-advance service, as in the previous example, but with an open-ended number of additions or changes to service components, such as adjustments or cancellations. Any such calculation must carefully take into account all of the separate time spans between the service inception date and the dates of these service changes.

AGGREGATE MEASURES IN ATOMIC MARTS

There is one final real-life issue challenging the conventional vision. We may have to deal with data that is:
Furthermore, when such aggregate-only data is assembled together with summary items obtained via normal count-and-sum aggregation from the atomic level, the process of calculating more derived business measures will create an asymmetry in the data mart. Some higher-level summaries will include the summary-only data, while others won't. Therefore, aggregation results from different paths may not match when compared at the top level. By any conventional vision, this situation is very undesirable in a data mart.

STAY TUNED: A STRUCTURE FOR REAL LIFE

I have laid out several real-world threads most data mart designers are likely to encounter: three distinct usage styles (the likelihood that the data mart is not the end of the information delivery pipeline; the existence of complex nonlinear aggregated measures; and the conflict between source-aggregated and data mart-aggregated data that should produce the same results but don't). In my next column, I'll draw these threads together and suggest an adaptable structure you can use to tackle these issues if they arise in your environment. I'll show you how I create a specific asymmetric aggregate level of the data mart. We'll look at how to recognize business requirements for it and decide how to build a custom user interface for storing the underlying component factors of the nonlinear aggregations in recognizable data mart dimensions.

Gabriel Tanase [gabriel@gabrieltanase.com] is a system designer based in Ireland. He has worked on several business intelligence projects for a leading European insurance provider.
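
As a supplementary illustration of the MSI and PTS aggregation types described in this column, here is a minimal Python sketch. It is not taken from any real data mart: the amortization formula, the quarterly parameter table, and every function and variable name are assumptions chosen only to make the two ideas concrete.

```python
from datetime import date


def residual_values(principal, annual_rate, annual_payment, years):
    """MSI sketch: the value for year n+1 is derived from the value for year n.

    Assumes a simple fixed-rate loan with one payment per year (hypothetical).
    Returns a list whose first element is the opening balance.
    """
    values = [principal]
    for _ in range(years):
        next_value = values[-1] * (1 + annual_rate) - annual_payment
        values.append(max(next_value, 0.0))  # a loan cannot fall below zero
    return values


# Hypothetical PTS parameter table: fraction of the premium earned once the
# given number of whole months has elapsed. The steps are nonlinear, and the
# quarterly granularity of the table limits the granularity of the measure.
EARNED_FRACTION_BY_MONTH = {0: 0.0, 3: 0.40, 6: 0.65, 9: 0.85, 12: 1.0}


def whole_months_between(start, end):
    """Number of complete months from start to end, never negative."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if end.day < start.day:
        months -= 1  # the current month is not yet complete
    return max(months, 0)


def booked_value(premium, inception, target):
    """PTS sketch: booked value is a table-driven percentage of the premium."""
    elapsed = whole_months_between(inception, target)
    # Use the latest table step that has already been reached.
    step = max(m for m in EARNED_FRACTION_BY_MONTH if m <= elapsed)
    return premium * EARNED_FRACTION_BY_MONTH[step]


if __name__ == "__main__":
    print(residual_values(1000.0, 0.10, 300.0, 3))
    print(booked_value(1200.0, date(2024, 1, 15), date(2024, 7, 20)))
```

Under these assumptions, the earned fraction depends on each policy's own inception date, so booked values have to be computed policy by policy and only then summed. Summing the premiums first and applying a single percentage would generally give a different result, which is what makes the measure nonlinear.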