Metamorphosis
Metadata features make something more of this ETL tool
By Steve Elkins

From a design and development perspective, a CIO ought to be looking for three things in an extract, transform, load (ETL) tool: a flattened learning curve, enhanced developer productivity, and a comprehensive and transparent metadata repository that features metadata sharing with leading data modeling and business intelligence tools. Metadata transparency is important to ensure that the original developer's successors can easily audit and maintain the finished ETL process.

DataStage XE from Ascential Software Inc. (formerly Informix Business Solutions) scores well on all these criteria. The DataStage XE suite is an integrated data warehouse development toolset that includes a robust second-generation ETL tool (DataStage), a data quality tool (Quality Manager), and a metadata management tool (MetaStage).

DataStage ETL Tool

The DataStage ETL component consists of four design and administration modules (Manager, Designer, Director, and Administrator), a metadata repository, and a server.

The DataStage Manager is the basic metadata management tool. You use it to create and organize various metadata types, including legacy source and data warehouse target data definitions and ETL job components residing in the DataStage Repository. Through MetaBroker add-ins, the Manager can import database design metadata from data warehouse design tools and exchange business metadata with business intelligence tools. Reporting and documentation tools for viewing the detailed metadata in the repository are included, as is Usage Reporting, which helps the developer identify dependencies among job components (though I'd be happier if dependency analysis were automated).

In the Designer module of DataStage, ETL tasks execute within individual "stage" objects that you assemble to create complete ETL "jobs." Database source and target stages can represent tables in a wide variety of database types, including mainframe databases.
They reference the database design metadata defined within the Manager. Within the basic Transformation Stage, where you define the data transformation to perform between source and target, the Expression Editor exposes a long list of BASIC language functions that you can easily assemble into transformation code and validate through a point-and-click user interface. Source and target data fields are mapped via drag-and-drop or an automapping wizard. In fact, except for having to manually code "where clauses" to define join conditions, I didn't have to touch the keyboard.

You can assemble frequently used combinations of stages into reusable Container modules. Data warehouse developers will find almost all they need within the library of built-in stages, but you can extend the development environment with plug-in stages. Plug-ins for bulk loading Oracle and SQL Server databases ship with DataStage, and you can write new, custom plug-ins in C. DataStage also includes a simple debugger and job version control. Although the Designer user interface lacks pizzazz, I found it to be relatively uncluttered and easy to learn (for an ETL tool).

The Director is DataStage's job validation and scheduling module. It supports both time- and event-driven scheduling. The DataStage Administrator is primarily for controlling security functions.

The DataStage Server is the engine that moves data from source to target. It runs on either NT or Unix. You can program the Ascential Accelerator to leverage Torrent Systems Inc.'s CoSort to distribute multiple, independent jobs among several processors in order to speed overall project execution. Ascential promises true parallel processing for individual jobs in a summer 2001 release.

New Mainframe Support

New in this release of the suite is an XE 390 version of DataStage that can generate mainframe data extraction jobs from the same design tools used to create client/server ETL jobs that run on NT or Unix.
If you're at one of the many companies that need to integrate data from both legacy mainframe and open systems environments, this capability will be extremely valuable to you.
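
For readers who have not used a stage-based ETL tool, the Designer's idea of assembling "stage" objects into a "job" can be pictured with a minimal Python sketch. This is not DataStage code and does not use its API: the `Job` class, the `transformer` and `row_filter` helpers, and the sample customer rows are all hypothetical names invented for illustration. The sketch only shows the general pattern of a source feeding a filter stage and a transformation stage whose column mappings produce the target rows.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List

Row = Dict[str, object]
Stage = Callable[[Iterable[Row]], Iterable[Row]]


@dataclass
class Job:
    """Hypothetical ETL job: an ordered chain of stage functions."""
    name: str
    stages: List[Stage] = field(default_factory=list)

    def add_stage(self, stage: Stage) -> "Job":
        self.stages.append(stage)
        return self  # returning self allows stages to be chained

    def run(self, rows: Iterable[Row]) -> List[Row]:
        # Feed the output of each stage into the next one.
        for stage in self.stages:
            rows = stage(rows)
        return list(rows)


def row_filter(predicate: Callable[[Row], bool]) -> Stage:
    """Build a stage that keeps only rows matching the predicate."""
    def stage(rows: Iterable[Row]) -> Iterable[Row]:
        return (row for row in rows if predicate(row))
    return stage


def transformer(mapping: Dict[str, Callable[[Row], object]]) -> Stage:
    """Build a stage that derives each target column from a source row."""
    def stage(rows: Iterable[Row]) -> Iterable[Row]:
        for row in rows:
            yield {target: derive(row) for target, derive in mapping.items()}
    return stage


# Hypothetical legacy source rows (balances arrive as text).
source_rows = [
    {"cust_id": 1, "first": "ada", "last": "lovelace", "balance": "120.50"},
    {"cust_id": 2, "first": "alan", "last": "turing", "balance": "-3.00"},
]

job = (
    Job("load_customers")
    .add_stage(row_filter(lambda r: float(r["balance"]) >= 0))
    .add_stage(transformer({
        "customer_key": lambda r: r["cust_id"],
        "full_name": lambda r: f'{r["first"].title()} {r["last"].title()}',
        "balance": lambda r: float(r["balance"]),
    }))
)

warehouse_rows = job.run(source_rows)
print(warehouse_rows)
```

In a graphical tool the mapping dictionary corresponds roughly to the field-to-field links you would draw by drag-and-drop, and each lambda to an expression you would build in an expression editor. The chain-of-functions design is just one simple way to model the idea.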











