Tricky Time SpansThe time dimension isn't nearly as simple as it looksIn some ways the time dimension seems like the simplest dimension. At least, if you're modeling calendar days, the time dimension is small and predictable. You can make up a 10-year time dimension in an afternoon using a spreadsheet; that's only about 7,300 rows. In the past couple of years, I've seen an increased demand for applications that need to ask questions about time spans. One person captured it nicely when he said, "Each record in my fact table is an episode of constant value over a region of time." Time spans can start and stop at arbitrary points in time. In some cases, time spans link together to form an unbroken chain; in other cases, time spans are isolated; and in the worst cases, time spans overlap arbitrarily. But each time span is represented in the database by a single record. To make this concept easier to visualize, imagine you have a database filled with atomic transactions, such as deposits and withdrawals from bank accounts. It includes open account and close account transactions. Each transaction implicitly defines an episode of constant value over a region in time. A deposit or withdrawal defines a new value for the account balance that's valid until the next transaction. This time span could be one second or many months. The open account transaction defines the status of the account as continuously active over a time span until a close account transaction appears. Before proposing a database design, let's remind ourselves of some of the time span questions we want to ask. Easy, Moderate, and Very HardStart off by limiting your questions to a granularity of individual days rather than parts of days, such as minutes and seconds (more on minutes and seconds at the end of this column). The easy queries include:
For these cases all you need is a single timestamp on the transaction fact table record. The first query picks up all the transactions with a timestamp in an interval specified in the user's query. The second query retrieves the timestamp from a selected transaction and compares it to the interval. The third set of queries replaces the simple timestamp with a calendar date dimension filled with a lot of helpful calendar attributes. This date dimension is connected to the fact table through a standard foreign key - primary key join. This is all-vanilla dimensional design and requires only a single time key in the fact table record to represent the required timestamp. So far, so good. By the way, when using complex calendar navigation, the queries become much easier if the verbose date dimension includes first day and last day markers for each defined span of time, such as "last day of quarter." This field would have the value N for all the days except the last day of the applicable quarter. The last day would have the value Y in the special field. These markers allow the complex business time spans to be easily specified in the queries. Note that the use of a verbose date dimension means the application isn't navigating a timestamp on the fact table (more about this in the last section). A second, moderately hard category of time span queries include:
Continue to make the simplifying assumption that all the time spans are described by calendar
days, not by minutes and seconds. It's possible to answer all these questions with the
single-timestamp design discussed earlier, but this approach requires complex and inefficient
queries. For instance, to answer the last query, you would need to search the set of account
transactions for the latest transaction at or prior to the desired point in time. In SQL, this
approach would take the form of a correlated sub-
|
Most Popular This Week
IE Weekly Newsletter
Subscribe to the newsletter
|
| |||||||||||||||||||||||||||||||





















