CMP -- United Business Media

Intelligent Enterprise

Better Insight for Business Decisions

March 27, 2001



Close the Loop

Integrate your warehouse with your operational systems before you are overrun

by Steve Tracy

As a data warehouse architect, you are faced with constantly changing rules around the role of your warehouse in the enterprise architecture. Expect your plans, architecture, and approach to be challenged repeatedly by IT and business management. Just when you are about to deliver, expect the rules to change - then change again.

I've had this task in my planner for two years: "Close the loop with the operational systems." These systems include ERP installations, legacy systems, and new ones under development. This goal has taken on an increased urgency with the new demands of the Web and the new focus on customer relationship management (CRM). Because ends are always coming loose, closing the loop is an ongoing task. But pursuing its completion is the key to long-term success for the data warehouse.

In his June 5 and June 26, 2000, Data Webhouse columns on enterprise application integration (EAI) and enterprise resource planning (ERP) systems, respectively, Ralph Kimball clearly described a major shift in our industry. Just as the lively methodology debates over modeling seemed to be settling down, along came an entirely new, urgent, Web-oriented spin on what used to be a low-priority integration effort.

You have a choice: Be a leader of change and integration, or sit back and react to each new application's demands on your data warehouse. My working assumption is that the vendors will not adequately address the integration needed to support data warehousing. They are simply too busy trying to enhance and maintain their incredibly complex database models and systems. And they may be trying to close their sales to marketing departments quickly without raising burdensome objections.

Closing the loop means more than consolidating your metadata and tracing your data lineage. It means making a commitment to a fully integrated data warehouse, including:

  • Actively participating in overall system planning and business strategy
  • Physically integrating the semantic layer of your data warehouse (business data elements and their metadata) with their operational world counterparts
  • Recasting your data warehouse information in extensible markup language (XML) and document type definitions (DTDs)
  • Weaning yourself off entity/relationship diagrams (ERDs) as a means of exposing the data warehouse data
  • Using query and extract-transform-load (ETL) tools to help manage your workflow, data quality, and information integration; using these tools to report on each other and build a consolidated view of your own processes
  • Working closely with the operational system architects to exploit your data warehouse toolkit, while also clearly delineating who takes responsibility for operational reporting.

Continuously Changing

When I consider the number of extracts, reports, repositories, ETL maps, and anything else touched by a change in a business rule or business data element, it's daunting. A renewed emphasis on integration is the only real hope for a graceful adaptation to change.

Take advantage of the current confusion about "who does what" in the integration of e-business with ERP. Put your stake in the ground today; make sure the webhouse will be the point of integration for your corporate business information assets by demonstrating you're the best prepared to enable that integration.

If this approach sounds risky, consider the alternative: spending most of your time in reactive mode, massaging the same information tediously for export and import in ever-changing formats. Your future influence depends on what you're doing today to prepare for it.

Not Just Buzzwords

If the terms EAI, ERP, and CRM are just buzzwords to you, take the time to understand what they mean. I divide these new subjects into two groupings:

E-business initiatives. Despite the premature obituaries investors have written for them, dot-com initiatives are alive and well. Business-to-business (B2B), business-to-consumer (B2C), and electronic exchange projects are mostly still young enough that they are more than willing to work with your methods and formats. If you are truly a data webhouse, these initiatives represent the "opportunities."

Enterprise applications and integration - CRM, BI, ERP, and so on. They're not young, and because of their inflexibility, you can count on them to cause you pain around integration issues. This group usually represents the "challenges."

Leverage Webhouse Tools and Partners

So, how can you plan for these changes and still stay on course with your data warehouse? What are the specific things you should be doing right now to ensure you're still piloting the ship next year?

You can alleviate at least some of the suffering by making the most of what you already have and choosing vendors that have an EAI vision. For example, I've seen this EAI vision mature in Informatica and Business Objects, our primary ETL and BI tools. Informatica's ERP plug-in and common Metadata Exchange (MX2) features are examples of its commitment to an integrated future.

With any luck, most of the tools and techniques required for true integration are already sitting in your current toolkit. Step back from your tools and think about applying them to closing the loop with your operational systems. Here's my take on the tools you already have:

ETL tools, modeling tools, and repositories. Integrate these ASAP; don't wait for the perfect "wizards" to make it all painless. Technologies such as MX2 are not perfect and certainly not completely integrated, but start with what's available. Find the strong and weak points in repository integration and representation of business rules. Cut the cord of using ERDs to communicate rules, which they do poorly.

At The Hartford Life, my coworkers Howard Galusha and Prasada Gunda keep finding creative new uses for cyclic redundancy check (CRC) algorithms. They recognized the potential for improving data quality while I was still focused on reducing the time it takes to load data.
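
The column does not say how CRCs are applied at The Hartford Life, so the following is only a sketch of one common ETL use: storing a CRC per row at load time and comparing it against the next extract to find new and changed rows without comparing every column. The table layout, column names, and the functions row_crc and classify are hypothetical. A 32-bit CRC can collide, so treat a match as "very probably unchanged" rather than as proof.

```python
import zlib

def row_crc(row, columns):
    """Return a CRC-32 over the chosen columns of a row (a dict)."""
    # Join values with a separator that should not occur in the data, so that
    # ("ab", "c") and ("a", "bc") do not produce the same input string.
    # Note that None and the empty string are treated as the same value here.
    payload = "\x1f".join("" if row[c] is None else str(row[c]) for c in columns)
    return zlib.crc32(payload.encode("utf-8"))

def classify(incoming, stored_crcs, key, columns):
    """Split incoming rows into new, changed, and unchanged by comparing CRCs.

    stored_crcs maps a business key to the CRC saved at the previous load.
    """
    result = {"new": [], "changed": [], "unchanged": []}
    for row in incoming:
        crc = row_crc(row, columns)
        previous = stored_crcs.get(row[key])
        if previous is None:
            result["new"].append(row)
        elif previous != crc:
            result["changed"].append(row)
        else:
            result["unchanged"].append(row)
    return result

# Hypothetical policy rows; the column names and codes are illustrative only.
COLUMNS = ["holder", "status"]
last_load = [
    {"policy_id": 1, "holder": "A. Smith", "status": "ACTV"},
    {"policy_id": 2, "holder": "B. Jones", "status": "ACTV"},
]
stored = {r["policy_id"]: row_crc(r, COLUMNS) for r in last_load}

extract = [
    {"policy_id": 1, "holder": "A. Smith", "status": "ACTV"},  # same as before
    {"policy_id": 2, "holder": "B. Jones", "status": "LAPS"},  # status changed
    {"policy_id": 3, "holder": "C. Brown", "status": "ACTV"},  # not seen before
]
outcome = classify(extract, stored, "policy_id", COLUMNS)
print({k: [r["policy_id"] for r in v] for k, v in outcome.items()})
```

The same per-row checksum serves both goals mentioned above: it shortens loads by skipping unchanged rows, and it gives you a cheap way to audit whether a row in the warehouse still matches its source.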

Query tools and their repositories. Query tools with a rich semantic layer can be used to report on almost anything. I use Business Objects as a point of integration for reporting on disparate data sources, including Business Objects' own Universe structure. Doing so lets me quickly get the data and the implied relationships in a system or tool, while reinforcing my core query environment capabilities.

You should look into your ETL repositories, workflow tracking databases, Lotus Notes databases, report job request queues, and more. Use the alerts, scheduling, and consistent presentation from your query environment to check up on your own processes.
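
As a rough illustration of checking up on your own processes, the sketch below queries a run log for loads that failed or dropped rows. The etl_run_log table and its columns are assumptions made for this example, not the schema of any real ETL repository, and an in-memory SQLite database stands in for wherever your tool keeps its run history. In practice the same query would sit behind a scheduled report or alert in your query tool.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical run-history table; a real ETL repository will look different.
conn.execute(
    """CREATE TABLE etl_run_log (
           job_name     TEXT,
           run_date     TEXT,
           status       TEXT,
           rows_read    INTEGER,
           rows_written INTEGER
       )"""
)
conn.executemany(
    "INSERT INTO etl_run_log VALUES (?, ?, ?, ?, ?)",
    [
        ("load_policy", "2001-03-26", "OK", 1000, 1000),
        ("load_claims", "2001-03-26", "OK", 500, 480),
        ("load_agents", "2001-03-26", "FAILED", 200, 0),
    ],
)

def runs_needing_attention(connection):
    """Return (job_name, status, rows_rejected) for failed or lossy runs."""
    cursor = connection.execute(
        """SELECT job_name, status, rows_read - rows_written AS rows_rejected
           FROM etl_run_log
           WHERE status <> 'OK' OR rows_read <> rows_written
           ORDER BY job_name"""
    )
    return cursor.fetchall()

alerts = runs_needing_attention(conn)
for job_name, status, rejected in alerts:
    print(f"{job_name}: status={status}, rows rejected={rejected}")
```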

Advanced DBMS capabilities and your good DBAs. Data warehouse DBAs can effectively use cross-media text indexing, functional indexes, SQL extensions, and direct storage of unconventional data in your databases.
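
To make one of these capabilities concrete, here is a small sketch of a functional (expression) index, using SQLite only because it ships with Python; your warehouse DBMS will have its own syntax and optimizer behavior. The customer table and index name are hypothetical. The index is built on upper(last_name), so a case-insensitive lookup written with the same expression can use the index rather than scanning the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, last_name TEXT)"
)
conn.executemany(
    "INSERT INTO customer (last_name) VALUES (?)",
    [("Tracy",), ("KIMBALL",), ("kimball",), ("Merz",)],
)
# Index the expression itself rather than the raw column.
conn.execute(
    "CREATE INDEX idx_customer_upper_name ON customer (upper(last_name))"
)

LOOKUP_SQL = (
    "SELECT customer_id, last_name FROM customer "
    "WHERE upper(last_name) = upper(?) ORDER BY customer_id"
)

def find_by_last_name(connection, name):
    """Case-insensitive lookup written to match the indexed expression."""
    return connection.execute(LOOKUP_SQL, (name,)).fetchall()

matches = find_by_last_name(conn, "Kimball")
print(matches)

# Ask SQLite how it plans to run the lookup. The wording of the plan varies
# by SQLite version, so it is printed here for inspection, not parsed.
for step in conn.execute("EXPLAIN QUERY PLAN " + LOOKUP_SQL, ("Kimball",)):
    print(step)
```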

At The Hartford Life, we are already dealing with the interesting problems of text storage. We're facing the challenge of managing the unstructured data through tagging and storage in new applications, similar to what Kimball describes in the October 20 and November 10, 2000, Data Webhouse columns "The Keyword Dimension" and "Fact Tables for Text Document Searching."

Integrated Definitions

If no one else in your company has developed a consolidated view of your corporate knowledge base, you must. It's hard to be "first up," but this step is essential. You can't afford to wait for someone else to finish the next big Global Business Information Model in the uncertain future.

Build your consolidated view in a standard set of XML-based DTDs to represent the data warehouse information. Once you have this kind of message-based architecture layer in place around the data warehouse, it becomes the API you provide for information access.

Your consolidated view takes the form of XML and the DTDs you define for your current business domain. Achieving true cross-company conformity within a business segment may require an effort on the level of some of the early EDI development projects. In my case, I'm happy to get at least my own department's information in a low-maintenance format for consistent internal and external use.
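
As a minimal sketch of what such a definition might look like, the example below wraps one business data element and its metadata in XML with an inline DTD. The element names (dataElement, name, definition, sourceSystem, value) and the sample values are invented for illustration; your own business domain would dictate the real vocabulary. Python's standard library parses the DOCTYPE but does not validate against it, so real validation would need a validating parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical DTD for one business data element and its metadata.
DTD = """<!DOCTYPE dataElement [
  <!ELEMENT dataElement (name, definition, sourceSystem, value)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT definition (#PCDATA)>
  <!ELEMENT sourceSystem (#PCDATA)>
  <!ELEMENT value (#PCDATA)>
]>"""

def to_xml(name, definition, source_system, value):
    """Serialize one data element as an XML message that carries its DTD."""
    root = ET.Element("dataElement")
    fields = (
        ("name", name),
        ("definition", definition),
        ("sourceSystem", source_system),
        ("value", value),
    )
    for tag, text in fields:
        ET.SubElement(root, tag).text = str(text)
    return DTD + "\n" + ET.tostring(root, encoding="unicode")

message = to_xml(
    name="annual_premium",
    definition="Total premium billed per policy year",
    source_system="policy_admin",
    value=1200,
)
print(message)

# Any consumer can read the message back without knowing the warehouse schema.
parsed = ET.fromstring(message)
print(parsed.find("name").text, parsed.find("value").text)
```

Carrying the definition and source system alongside the value is what keeps the business data element and its metadata together when the message leaves the warehouse.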

Remember the Web in Webhouse

It's a daunting task to try to add value to the Data Webhouse Toolkit (John Wiley & Sons, 2000). Kimball and coauthor Richard Merz lay out a solid framework to prepare your data warehouse for the Web. This framework includes extending the existing data warehouse model to prepare for clickstream analysis, recognizing and building new clickstream value chains, and applying data-mining techniques to the mountains of information collected from Web server logs.



If your impression is that the book is just about CRM and clickstream analysis, think again. Part of the Data Webhouse Toolkit is about integration and preparation for the challenges I described previously. The same central concepts, which enable you to become truly integrated with the Web, also apply to becoming flexible and adaptable enough to integrate with the systems in your own backyard.

"Closing the loop" is about maximizing your resources and your ability to stay focused on core data warehouse goals. If you take a purely reactive role, double your head count and shorten your project list. You will be spending an increasing amount of your time extracting information or receiving data from every well-connected business application that comes along.

As Robert Scott, my friend and partner at The Hartford Life, frequently states, "It ain't that hard, or at least it doesn't have to be." Investing in closing the loop is your choice to make it easier.

 

Guest columnist Steve Tracy (steve.tracy@hartfordlife.com) is an assistant director of information delivery for The Hartford Life Insurance Company and has built production data warehouses in the healthcare, retail, and environmental engineering sectors.






