CMP -- United Business Media

Intelligent Enterprise

Better Insight for Business Decisions

UBM
Intelligent Enterprise - Better Insight for Business Decisions
Part of the TechWeb Network
Intelligent Enterprise
search Intelligent Enterprise





May 31, 2003

Is Hand-Coded ETL the Way to Go?

Absolutely yes, or absolutely no, depending...

by Gary Nissen
edited by Ralph Kimball

No matter how often people pose this question, it never fails to surprise me: Is it acceptable, or even recommended, to hand code the extract, transform, load (ETL) logic for a data warehouse, or is tool-based ETL always preferred?

My standard answer is that it's perfectly appropriate to custom-develop the ETL code. In the right situation, custom-developed ETL can be cheaper and more maintainable. But in other situations, a tool-based approach is better. It depends. It all comes down to understanding the scope and requirements of the project.

I guess what surprises me is that, because this decision is no different from any other technology decision in any other development project, I don't expect anyone to assume there's one best answer. You should consider all significant requirements, balance your options, and make a business decision. In other words, just follow your standard project plan, and it will lead you to the right decision.

A Simple Project Plan

In my experience, every well-run project follows steps similar to the following, in a reasonable and efficient manner:

  1. Write agreed-upon requirements with enough detail so that all parties generally understand each other and the project's scope. It's also a good idea to identify some of the tougher deliverables and agree to a general methodology to finish each. The principal parties to an ETL system besides the ETL developers are the original data providers (such as the production DBAs), the data cleaning and loading specialists, the presentation system DBAs who mainly build star schemas, and the application developers who in turn serve the business community.
  2. Have the ETL development team agree to the design decisions, then document them. Your goal is to design the simplest possible solution that satisfies the requirements. The design document should include technology decisions, data flows, and development standards.
  3. Develop ETL transformation modules and data flows according to the designs. Throughout the process, the developers must continually conduct tests to ensure that requirements are being met.
  4. Write documentation that supports DBAs, IT managers, and future developers.
  5. The data warehouse DBAs must conduct acceptance testing by working with the system to ensure both that it delivers what was requested and that what was asked for meets the actual needs.
  6. Put in place a training and support system to provide user guidance and feedback for improvements.

Most people see this list and assume that each step must be carried out to the extreme and, therefore, will take too much time and effort. That perception will doom a project just as quickly as skipping these steps will. You must follow each step just enough to successfully move on to the next step. Anything else is wasted effort.








IE Weekly Newsletter
Subscribe to the newsletter
    Email Address