CMP -- United Business Media

Intelligent Enterprise

Better Insight for Business Decisions






October 4, 2001



Analytics on Demand: The Zero Latency Enterprise

The ability to collect and analyze information in real time is a cornerstone of the "intelligent" organization

By Colin White

EXECUTIVE SUMMARY

In the volatile e-business environment, organizations need to provide business users with analytics in real time if they are to remain competitive, provide good customer service, and optimize e-business operations. To enable the so-called zero latency enterprise, you need an architecture that can collect and analyze information continuously, in real time, and make the results available without delay to business users.

I approached writing this article with some trepidation for two reasons. First of all, about a year ago, I gave a presentation on realtime data warehousing to a group of CIOs. The feedback from some of the audience was less than positive: they were alarmed that I would suggest to business users that they should expect information out of a decision-support system in real time. After all, OLTP systems may be designed for realtime operations, but the current batch approach to building data warehouses is not.

Second, the IT industry buzzword generators have been hard at work, and terms such as active data warehouse, zero latency enterprise (ZLE), realtime data warehousing, realtime analytics, business activity monitoring, and realtime personalization all require explanation. Could I really sort out all these issues and terms in a few thousand words?

In this article, then, I hope to convince executives and technicians alike that there is a role and a business case for realtime operations in data warehousing and business intelligence (BI) systems, and that the necessary supporting technology does exist. I will also explain the industry buzzwords and show how the different kinds of realtime decision processing can integrate into a framework for supporting what I call the intelligent business.

For convenience, throughout the article, I will use the term realtime decision processing to describe processing that supports rapid access to information and analyses from any place at any time for making realtime business decisions.

Realtime Systems Emerge

Data warehousing and BI applications are usually considered to be separate standalone decision-support systems used for strategic planning and decision-making. As these applications have matured, however, it has become apparent that the information and analyses they provide have also become vital to tactical day-to-day decision-making, and many companies can no longer operate their businesses effectively without them. Consequently, there is a trend toward integrating decision processing into the overall business process. The increasing use of e-business is also encouraging this integration because organizations need to react much faster to changing business conditions in the e-business world.

The integration of decision processing into the overall business process is achieved by building a closed-loop system where the output of decision processing applications is delivered to business users in the form of recommended actions (product pricing changes, for example) for addressing business issues. In the e-business environment, many companies are looking to extend this closed-loop processing to automatically adjust business operations based on messages generated by a decision engine. Some companies would also like this automated closed-loop processing to occur in real time.

The use of the term "realtime" in the context of decision processing is controversial because it suggests that a realtime decision-support system must be able, in a matter of seconds, to extract and cleanse operational information; transform and load the extracted information into a data warehouse; analyze the transformed information; route the results of the analyses to a decision engine; and generate appropriate business recommendations or action messages to modify business operations.

Clearly, this process is not feasible. In reality, however, realtime decision processing is not done in a single monolithic step, as is usually the case in operational processing. Instead, three main facilities are involved in realtime decision processing, each of which can operate and be used independently (see Figure 1):

  • An event-driven hub that can, in real time, capture and transform operational and e-business data and load it into a data warehouse
  • An analysis engine that can generate and provide access to current business analyses from any place at any time
  • A rules-driven decision engine that can make recommendations or create operational and e-business action messages in real time (more on decision engines later).
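To make the division of labor among the three facilities concrete, here is a minimal Python sketch. It is an illustration only, not a description of any vendor's product: the SaleEvent record, the in-memory Warehouse class, the function names, and the reorder rule are all hypothetical stand-ins chosen for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class SaleEvent:
    """Hypothetical operational event, e.g. one point-of-sale line item."""
    product: str
    quantity: int
    unit_price: float


class Warehouse:
    """In-memory stand-in for the realtime data warehouse."""

    def __init__(self):
        self.rows = []

    def load(self, row):
        self.rows.append(row)


def hub_capture(event, warehouse):
    """Event-driven hub: transform one event and load it as it arrives."""
    row = {
        "product": event.product.strip().upper(),  # simple cleanup step
        "quantity": event.quantity,
        "revenue": event.quantity * event.unit_price,
    }
    warehouse.load(row)


def analyze_units_sold(warehouse):
    """Analysis engine: current units sold per product, computed on demand."""
    totals = defaultdict(int)
    for row in warehouse.rows:
        totals[row["product"]] += row["quantity"]
    return dict(totals)


def decide(units_sold, reorder_threshold=10):
    """Rules-driven decision engine: one action message per rule that fires."""
    return [
        f"REORDER {product}"
        for product, units in sorted(units_sold.items())
        if units >= reorder_threshold
    ]


warehouse = Warehouse()
for event in [
    SaleEvent(" widget ", 6, 2.50),
    SaleEvent("gadget", 3, 10.00),
    SaleEvent("Widget", 5, 2.50),
]:
    hub_capture(event, warehouse)  # each event is loaded individually

units = analyze_units_sold(warehouse)
actions = decide(units)
print(units)
print(actions)
```

Each function can be called on its own, which mirrors the point that the three facilities operate and can be used independently: the hub loads data whether or not anyone is analyzing it, and the decision engine only needs the output of an analysis.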

Building a Data Warehouse in Real Time

Most data warehouses are built by user-written programs or vendor-supplied extract-transform-load tools that extract data from operational and e-business source systems, clean and transform the extracted data, and then load it into a data warehouse. This processing normally involves batch jobs that take periodic (daily or weekly, for example) nonvolatile snapshots of the source data.

To support realtime data warehousing, this batch snapshot approach to extracting source data must be replaced by processes that continuously monitor source systems and capture and transform data changes as they occur, and then load those changes into a data warehouse, in as close to real time as possible.
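The difference between the two approaches can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the source table is a plain dictionary, a captured change is a hypothetical (operation, key, row) tuple, and the warehouse is an append-only history list so that loaded data stays nonvolatile.

```python
def batch_snapshot(source):
    """Batch approach: copy the whole source table at one point in time."""
    return {key: dict(row) for key, row in source.items()}


def apply_change(history, seq, change):
    """Continuous approach: append one captured change as it occurs.

    Rows are only ever appended, never overwritten, so the history
    remains a nonvolatile record of what the source looked like.
    """
    op, key, row = change
    if op not in ("insert", "update"):
        raise ValueError(f"unknown operation: {op}")
    history.append({"seq": seq, "op": op, "key": key, "row": dict(row)})


def latest_view(history):
    """Derive the current state of each key from the change history."""
    current = {}
    for entry in history:  # history is already in capture order
        current[entry["key"]] = entry["row"]
    return current


# State of the source when the nightly batch job ran.
source = {1: {"status": "open"}}
snapshot = batch_snapshot(source)

# Changes that happen after the batch job has finished.
changes = [
    ("update", 1, {"status": "shipped"}),
    ("insert", 2, {"status": "open"}),
]
history = [{"seq": 0, "op": "insert", "key": 1, "row": {"status": "open"}}]
for seq, change in enumerate(changes, start=1):
    apply_change(history, seq, change)

print(snapshot)              # stale until the next batch run
print(latest_view(history))  # reflects every captured change
```

The snapshot still shows order 1 as open and knows nothing about order 2, while the change-driven history is current as soon as each change is applied.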

The closest existing decision-support systems get to realtime processing is in an operational data store (ODS), which comprises an integrated store of detailed near-current operational data. Depending on application requirements, the currency of ODS data may be within a few seconds, or a few hours, of the operational source systems from which it is captured. To maintain this near-realtime view of operational data, ODS data-extract routines usually capture data continuously from source systems via database triggers, data replication, or the extraction of information from database recovery logs. Vendors have also begun building capture interfaces to Web server logs and clickstream data, and to messaging and enterprise application integration (EAI) software products. IBM, for example, recently released a bridge from DB2 Warehouse Manager to its WebSphere MQ (formerly IBM MQSeries) messaging product. Informatica Corp. has also added support to its PowerCenter data warehouse software for IBM WebSphere MQ and for products from Tibco Software Inc., WebMethods Inc., and Vitria Inc.
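Of the capture techniques just listed, database triggers are the easiest to demonstrate. The sketch below uses Python's standard sqlite3 module with an in-memory database; the orders table, the orders_changes change table, and the trigger names are hypothetical, and a production source system would use its own DBMS's trigger syntax rather than SQLite's.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT NOT NULL);

-- Change table that an ODS extract routine would read from.
CREATE TABLE orders_changes (
    seq      INTEGER PRIMARY KEY AUTOINCREMENT,
    op       TEXT NOT NULL,
    order_id INTEGER NOT NULL,
    status   TEXT
);

-- Record every insert on the source table.
CREATE TRIGGER orders_ins AFTER INSERT ON orders
BEGIN
    INSERT INTO orders_changes (op, order_id, status)
    VALUES ('I', NEW.id, NEW.status);
END;

-- Record every update on the source table.
CREATE TRIGGER orders_upd AFTER UPDATE ON orders
BEGIN
    INSERT INTO orders_changes (op, order_id, status)
    VALUES ('U', NEW.id, NEW.status);
END;
""")

# Ordinary operational activity; the application is unaware of the capture.
conn.execute("INSERT INTO orders (id, status) VALUES (1, 'open')")
conn.execute("UPDATE orders SET status = 'shipped' WHERE id = 1")
conn.commit()

# The extract routine reads changes in the order they occurred.
captured = conn.execute(
    "SELECT op, order_id, status FROM orders_changes ORDER BY seq"
).fetchall()
print(captured)
```

The operational application writes to orders as usual, and the triggers record each change as a side effect. An extract routine can then poll orders_changes by sequence number and forward only what is new.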

All the approaches used to build an ODS are equally applicable to building a realtime data warehouse. In fact, vendors that claim support for realtime decision processing frequently have an ODS at the center of their solution. Without getting into an unnecessary theoretical debate, it is sufficient to point out that the information store used for realtime decision processing usually contains nonvolatile data and may also contain summarized data. This information store therefore behaves more like a data warehouse than an ODS, which is why you could accurately describe such a store as a realtime data warehouse. (There are several other terms in use here as well: NCR Corp., for example, uses the term active data warehousing to describe realtime data warehouse operations.)

The move by vendors toward supporting messaging and EAI software in data warehousing products is a key first step toward providing a realtime data warehouse solution. In Figure 1, this feature is identified as an event-driven integration hub; such a hub enables links to realtime data feeds (such as point-of-sale terminals and telephone systems) and leading front- and back-office operational packages from companies such as SAP and Siebel Systems Inc.

One question that arises about the use of an event-driven integration hub concerns the need for data warehousing tools to transform and load captured data into the realtime data warehouse: Given that messaging and EAI also perform data transformation, why can't these products be used in place of data warehousing tools?

The position of the data warehousing vendors on this topic is that their existing tools offer more powerful capabilities for data cleanup, transformation, integration, and management than those offered by messaging and EAI vendors. Messaging and EAI software does have the benefit of being designed from the ground up for the high-volume processing of continuous data streams, and this architecture is ideally suited for realtime processing. In contrast, data warehousing tools were designed for the batch processing of data, and they must be modified to support a data pipelining approach if they are to support realtime data warehousing. Data warehousing vendors such as Informatica have already begun this process by adding facilities such as parallel and always-connected sessions for the reading, writing, and transformation of data. It remains to be seen, however, how the data transformation debate will pan out in the future.
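The idea behind data pipelining can be illustrated with Python generators. This is a generic sketch of the technique, not a description of how Informatica or any other vendor implements it; the record layout and the function names are assumptions made for the example.

```python
def read_source(records, log):
    """Reader stage: yield records one at a time as they become available."""
    for record in records:
        log.append(f"read {record['id']}")
        yield record


def transform(stream):
    """Transform stage: convert each record without waiting for the rest."""
    for record in stream:
        yield {
            "id": record["id"],
            "amount_cents": round(record["amount"] * 100),
        }


def load(stream, target, log):
    """Load stage: write each transformed row as soon as it arrives."""
    for row in stream:
        target.append(row)
        log.append(f"load {row['id']}")


source_records = [
    {"id": 1, "amount": 19.99},
    {"id": 2, "amount": 5.00},
]
target = []
log = []

# The three stages are chained, so a record flows through all of them
# before the next record is read.
load(transform(read_source(source_records, log)), target, log)

print(log)
print(target)
```

The log shows reads and loads interleaved (record 1 is loaded before record 2 is read), whereas a batch job would read every record, then transform them all, then load them all. That interleaving is what lets a pipelined tool keep a warehouse close to current while data is still arriving.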






