
Intelligent Enterprise

Better Insight for Business Decisions

June 17, 2003

Real Time: Get Real

Take the idea of a real-time data warehouse with a grain of salt, then realize the possibilities

by Neil Raden
edited by Ralph Kimball


The ability to include up-to-the-second analytics is extremely useful in many applications, including call-center support, fraud detection, yield management, and many kinds of financial transactions. But proposed requirements for more timely data should be scrutinized to determine whether they'd produce bona fide business value.

Keep in mind that it isn't the data that determines whether a low-latency approach is appropriate: it's the application of the data — in other words, the underlying business process the data supports. Consider an insurance company in which an actuary analyzes assets and liabilities for adequacy testing using data on internal fund values and betas, and on loss and loss reserves. Does it matter if the data is from yesterday, last week, or even last month? Not really. The data isn't that volatile, and any recent observation date is adequate. However, a trader in the same company has an undeniable need for the most recent fund data possible to support intraday trading. Same data, different business processes.

Consider a railroad. The real-time location and status of every car and engine is essential information for customer service and dynamic train rerouting. A logistics specialist looking for ways to reduce operating costs and boost load percentages, on-time performance, and so on would use the same kind of data but wouldn't mind if it were a day or two old.

Most likely, you've already designed a data warehouse with the data your business needs. You simply need to engineer a process for appending it more often. Although doing so isn't easy, it's simpler than designing an entirely new data warehouse. RTDWs are really nothing more than data warehouses with the load-related latency squeezed out.
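One way to picture "appending more often" is a small incremental load that runs every few seconds or minutes instead of once a night. The sketch below is only an illustration under stated assumptions: the table names (transactions, fact_transactions), the columns, and the ever-increasing id used as a watermark are all hypothetical, and SQLite stands in for whatever source system and warehouse you actually run.

```python
import sqlite3


def append_new_rows(source, warehouse, last_loaded_id):
    """Copy source rows newer than the watermark into the warehouse.

    Returns the new watermark (the highest id loaded so far).
    Table and column names are hypothetical.
    """
    rows = source.execute(
        "SELECT id, account, amount FROM transactions WHERE id > ? ORDER BY id",
        (last_loaded_id,),
    ).fetchall()
    if rows:
        with warehouse:  # commit each micro-batch as one transaction
            warehouse.executemany(
                "INSERT INTO fact_transactions (id, account, amount) "
                "VALUES (?, ?, ?)",
                rows,
            )
        last_loaded_id = rows[-1][0]
    return last_loaded_id


def demo():
    """Run two small load cycles against in-memory databases."""
    source = sqlite3.connect(":memory:")
    warehouse = sqlite3.connect(":memory:")
    source.execute(
        "CREATE TABLE transactions "
        "(id INTEGER PRIMARY KEY, account TEXT, amount REAL)"
    )
    warehouse.execute(
        "CREATE TABLE fact_transactions "
        "(id INTEGER PRIMARY KEY, account TEXT, amount REAL)"
    )
    source.executemany(
        "INSERT INTO transactions VALUES (?, ?, ?)",
        [(1, "A", 10.0), (2, "B", 20.0)],
    )
    source.commit()
    watermark = append_new_rows(source, warehouse, 0)

    # A new operational row arrives; the next cycle picks up only that row.
    source.execute("INSERT INTO transactions VALUES (3, 'A', 5.0)")
    source.commit()
    watermark = append_new_rows(source, warehouse, watermark)

    loaded = warehouse.execute(
        "SELECT COUNT(*) FROM fact_transactions"
    ).fetchone()[0]
    return watermark, loaded
```

The warehouse stays online throughout: each cycle appends only the rows that arrived since the previous one, so shortening the interval between calls is what squeezes out the load-related latency.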

Getting There

Designing RTDWs involves some new tricks. More important, it requires abandoning a few common practices. The first is shutting down your data warehouse for hours at a time (and only once a day) to load new data and do all the necessary housekeeping. Data warehouses have traditionally operated in two distinct modes: online (answering queries during the day) and offline (undergoing maintenance at night). In RTDWs, this split isn't possible, and eliminating downtime will change design principles. Under the traditional model, because no updating occurred during the day, designers could index the tables any way they liked. And the star schema worked because it wasn't updated a row at a time.

The second practice that will have to end is creating downstream structures (data marts, OLAP cubes, and so on) to provide snappy query response. These structures are notoriously slow to load, calculate, and aggregate. If the structure being queried takes longer to build than the refresh cycle allows, it will have to be replaced. For very low latency RTDWs, the database that's updated in real time has to be the one that's queried: There simply isn't time to create downstream structures.
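The build-time rule above can be written down as a simple comparison. This is a rough sketch, not a sizing method: the structure names and build times are invented for illustration, and a real assessment would use measured build times from your own environment.

```python
def structures_to_replace(build_seconds_by_structure, refresh_cycle_seconds):
    """Return the downstream structures whose rebuild takes longer than
    one refresh cycle, sorted by name."""
    return sorted(
        name
        for name, build_seconds in build_seconds_by_structure.items()
        if build_seconds > refresh_cycle_seconds
    )


# Hypothetical build times, in seconds, for three downstream structures.
BUILD_TIMES = {
    "sales_cube": 5400,     # 90 minutes
    "regional_mart": 600,   # 10 minutes
    "daily_summary": 45,
}

# With a 15-minute refresh cycle, only the cube cannot keep up.
fifteen_minute_cycle = structures_to_replace(BUILD_TIMES, 15 * 60)

# With a 30-second cycle, nothing downstream can be rebuilt in time.
thirty_second_cycle = structures_to_replace(BUILD_TIMES, 30)
```

As the refresh cycle shrinks, the list grows until every downstream structure is on it, which is the point at which queries have to run against the database that is being updated.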

Five or 10 years ago, data warehouses were updated on a monthly (sometimes weekly, but rarely daily) basis because architectures were slow, programs were clunky, data quality was poor, and errors and restarts were frequent. Bringing the system down, waiting for other batch processes to complete, and running the refresh end-to-end in a few hours was very difficult. Improvements in every category over the last few years have meant that daily updates are now routine. One thing hasn't changed: the need to bring the system down, leave it in an unstable state for some hours, and bring it back online when everything's correct. ETL methodologies are based on this concept. But because data warehouses must be available for as many as 18 hours a day, the daily refresh has become a pretty hard limit.




Imagine the Possibilities

The emergence of RTDWs will be driven by the opportunity to find entirely new business processes that were never considered before. The value of merging operational and analytic activities in real time has to be integral to the business process in question. In other words, don't look solely to your existing applications for candidates. The best application for the technology might very well be for you to create new versions of your business processes: things you could never do before.


Neil Raden [nraden@hiredbrains.com] designs and implements data warehouses and analytic applications for clients in North America and Europe.









