Intelligent Enterprise

Better Insight for Business Decisions

December 5, 2001

Ghost in the Machine

Reaching consensus on the meaning and value of metadata is an important step toward making the best possible use of enterprise information

By Malcolm Chisholm

Continued from Page 1

Most people would agree that there is a core set of metadata that every IRM function will care about. This metadata includes the definitions of entities, attributes, and relationships: the very stuff of data models. The circle can be widened by including the physical as well as logical descriptions of databases, thereby bringing in things such as table names, column names, and data types. However, many organizations also expect their IRM function to provide impact analysis on data changes; for example, to help identify which programs, screens, reports, and interface files (at a minimum) use a particular column. Thus, programs, reports, screens, and so on must be tracked along with their use of physical data columns. This requirement makes the metadata managed by the IRM function broader still. In addition, IRM functions are often required to do much more, such as:

  • Enforce data quality. Specific metadata may include measurements of data quality in particular columns.
  • Migrate data from legacy systems. Specific metadata may include mapping of reference data values.
  • Deploy extract-transform-load (ETL) processes to build data marts and warehouses. Specific metadata may include the source of the loaded data and the time of the update.
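To make the first item in this list concrete, here is a minimal Python sketch of one possible data quality measurement: the fraction of missing values in a column. The column name, the sample values, and the choice of a null rate as the measure are all assumptions made for illustration; the article does not prescribe any particular metric.

```python
from typing import Any, Optional, Sequence


def null_rate(values: Sequence[Optional[Any]]) -> float:
    """Return the fraction of missing (None) values in one column."""
    if not values:
        return 0.0
    missing = sum(1 for v in values if v is None)
    return missing / len(values)


# Hypothetical sample of a customer table's phone_number column.
phone_numbers = ["555-0100", None, "555-0101", None]

# The metadata record describes the column; it is not part of the column's data.
quality_metadata = {
    "column": "customer.phone_number",  # assumed name, for illustration only
    "null_rate": null_rate(phone_numbers),
}
print(quality_metadata)
```

The measurement is stored beside the column's name rather than in the column itself, which is what makes it metadata rather than regular data.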

This list could go on much longer. The point is that it would be impossible to limit the metadata that an IRM function may have to manage.
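The impact analysis mentioned earlier, in which programs, screens, reports, and interface files are tracked along with the columns they use, can be sketched in a few lines of Python. Every artifact and column name below is hypothetical, and a real repository would hold far more detail; this is only an illustration of the lookup, not a description of any particular tool.

```python
from collections import defaultdict

# Hypothetical usage records: (artifact type, artifact name, column used).
USAGE = [
    ("program", "post_interest", "account.balance"),
    ("report", "monthly_statement", "account.balance"),
    ("screen", "account_summary", "account.open_date"),
    ("interface file", "ledger_feed", "account.balance"),
]


def build_index(usage):
    """Group artifacts by the physical column they use."""
    index = defaultdict(list)
    for artifact_type, artifact_name, column in usage:
        index[column].append((artifact_type, artifact_name))
    return index


def impact_of(column, index):
    """List every tracked artifact that a change to `column` could affect."""
    return sorted(index.get(column, []))


index = build_index(USAGE)
for artifact_type, artifact_name in impact_of("account.balance", index):
    print(f"{artifact_type}: {artifact_name}")
```

Asking for the impact of a change to the assumed `account.balance` column returns the interface file, the program, and the report that use it, but not the screen, which uses a different column.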

THE BIGGER PICTURE

If you can't pin down the kind of metadata that the IRM function truly manages, what about other kinds of metadata? "Technical" metadata includes more than the metadata an IRM function manages, and if you add in "business" metadata, then you must be addressing a very large universe of metadata indeed. It is still true that metadata is data about data, but is this distinction now meaningless?

One unique property of metadata is that it can't exist without preexisting data. This data may not be present in a database, as in the case of a data model for an as-yet-unimplemented physical database. In these cases, the metadata must be mostly related to design. When you look at data that does exist in implemented databases, there's no limit to the diverse kinds of metadata that can be built around it. In such databases, if an underlying item of data is created, changed, or deleted, then the possibility exists that associated metadata must change in tandem. For instance, if a bank adds a new account record to its Account table, it won't change the semantic definition of Account, but it will change the count of account records, and perhaps also a statistic that stores the rate at which new accounts are opened.
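The bank example can be sketched as a toy Python class. The class name, its fields, and the wording of the definition are assumptions made for illustration. The sketch shows only that adding a record leaves the semantic definition alone while the record count and the per-day opening statistic change with it.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class AccountTable:
    """Toy Account table that keeps derived metadata in step with its rows."""

    rows: list = field(default_factory=list)
    # Semantic definition (assumed wording); adding rows never changes it.
    definition: str = "An arrangement under which a customer holds funds with the bank"
    record_count: int = 0
    opened_per_day: dict = field(default_factory=dict)

    def add_account(self, account_id: str, opened_on: date) -> None:
        self.rows.append((account_id, opened_on))
        # Derived metadata changes in tandem with the underlying data.
        self.record_count += 1
        self.opened_per_day[opened_on] = self.opened_per_day.get(opened_on, 0) + 1


accounts = AccountTable()
accounts.add_account("A-1001", date(2001, 12, 3))
accounts.add_account("A-1002", date(2001, 12, 3))
accounts.add_account("A-1003", date(2001, 12, 4))
print(accounts.record_count)
print(accounts.opened_per_day)
```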

A special feature of data stored in databases is that it exists in an enabling infrastructure that permits the gathering of metadata. If supermarket managers wish to record how many people visit a particular aisle at various times of the day, they will have to physically place a person there to monitor this behavior. If an online stockbroker wishes to analyze what times of day people ask for stock quotations, the system's infrastructure probably requires little change to automatically gather this information.
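The stockbroker case can be illustrated with a short Python sketch in which the function that serves a quote also records the hour of the request. The function name, the price table, and the ticker symbols are hypothetical; the point is only that gathering this metadata takes one extra line in code that already handles each request.

```python
from collections import Counter
from datetime import datetime

# Hypothetical price table standing in for the broker's quote system.
PRICES = {"ABC": 41.25, "XYZ": 17.80}

# Metadata: how many quote requests arrived in each hour of the day.
requests_by_hour = Counter()


def get_quote(symbol: str, requested_at: datetime) -> float:
    """Serve a quote and record the hour of the request as a side effect."""
    requests_by_hour[requested_at.hour] += 1
    return PRICES[symbol]


get_quote("ABC", datetime(2001, 12, 5, 9, 31))
get_quote("XYZ", datetime(2001, 12, 5, 9, 47))
get_quote("ABC", datetime(2001, 12, 5, 15, 2))
print(sorted(requests_by_hour.items()))
```

Compare this with the supermarket, where the same question requires placing a person in the aisle.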

The fact that enterprise information architecture can be used in this way is now beginning to dawn on a great many people. For instance, business metrics, which measure the efficiency of the business processes executed by various organizational units, seem to be a natural fit with this approach. It means that a great deal more value can be extracted from the existing information architecture with relatively little additional investment.

Implementing this kind of metadata may well collide with the myopic, project-focused view of computerized applications and databases. A future challenge for IRM may be to ensure the integrity of the links between the data and metadata.

METADATA'S MEANING

Metadata is data about data. However, the term is often used as shorthand for the data that supports the unique business processes carried out by the IRM function of an enterprise. Although the IRM function needs an easy way to refer to the data it uses, calling it metadata may be misleading, because ownership by IRM is not what makes data metadata. IRM staff may think that they can define all the metadata that they need to use, but this goal is probably unattainable.



Moreover, a vast amount of metadata exists that isn't managed by the IRM function, and is used elsewhere by business users or other kinds of IT staff. This metadata often receives little attention from IRM staff, or is treated by them as if it were regular data. The fact remains that it is metadata because it derives from data.

It's true that metadata exhibits all the characteristics of regular data, but it differs in one important respect: Metadata has linkages (or relationships) to regular data, such that when the regular data changes the associated metadata may need to change synchronously. In practice, many of these linkages are currently maintained by human analysts, but in the future you can expect greater automation. Computerized information architectures can easily permit metadata to be defined and automatically updated as the regular data upon which it is based changes. There are many potential uses for such metadata, and it is likely to have even more value in the future.
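One way a computerized architecture might keep such a linkage current without a human analyst is a database trigger. The sketch below uses Python's built-in sqlite3 module; the table names, the metadata table layout, and the choice of a row count as the metadata item are assumptions made for illustration, not a design the article proposes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE account (id INTEGER PRIMARY KEY, holder TEXT NOT NULL);

    -- Metadata table: one row per tracked table (assumed design).
    CREATE TABLE table_stats (
        table_name TEXT PRIMARY KEY,
        row_count  INTEGER NOT NULL
    );
    INSERT INTO table_stats VALUES ('account', 0);

    -- The triggers are the linkage: each fires once per inserted or deleted row.
    CREATE TRIGGER account_insert AFTER INSERT ON account
    BEGIN
        UPDATE table_stats SET row_count = row_count + 1
        WHERE table_name = 'account';
    END;

    CREATE TRIGGER account_delete AFTER DELETE ON account
    BEGIN
        UPDATE table_stats SET row_count = row_count - 1
        WHERE table_name = 'account';
    END;
    """
)

conn.executemany(
    "INSERT INTO account (holder) VALUES (?)", [("Ann",), ("Bob",), ("Cy",)]
)
conn.execute("DELETE FROM account WHERE holder = 'Bob'")

(row_count,) = conn.execute(
    "SELECT row_count FROM table_stats WHERE table_name = 'account'"
).fetchone()
print(row_count)
```

The application code never touches `table_stats` directly; the database updates the metadata synchronously whenever the regular data changes.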


Malcolm Chisholm [mchisholm@refdataportal.com] has more than 20 years of experience in IT, with particular focus on extracting metadata from data models for use in software applications. He is the author of the recent book, Managing Reference Data in Enterprise Databases (Morgan Kaufmann, 2000), and maintains the Web site www.refdataportal.com.








