Ghost in the Machine

Reaching consensus on the meaning and value of metadata is an important step toward making the best possible use of enterprise information

By Malcolm Chisholm

What is metadata? The answer to that question is usually "data about data," but that definition provides few specifics. Nor does it provide a level of understanding that could help you "do" something with metadata. Rather than concentrate on a definition of metadata, it's more worthwhile to look at its characteristics to see if it really can stand as a distinct class of data in an enterprise information architecture. If you take this approach, you are more likely to find what makes metadata unique, and you'll be in a better position to judge how valuable it is to devote resources to managing it.

This task is particularly important at the present time because many enterprises are addressing what they perceive as metadata issues, and are prepared to invest significant financial resources to address them. These enterprises may have a number of goals for metadata, ranging from more effective management of their information architectures to generating actionable information that can directly support their businesses. A deeper engagement with metadata is likely to be a hallmark of a more intelligent enterprise in the coming years.

METADATA IN BUSINESS
In the world of IT, data means "the stored representation of facts." This definition also holds true for metadata, and metadata is stored in the same way as data in computerized databases. There is no doubt that metadata can be characterized by entities, attributes, and relationships just like data. For instance, a CASE tool may publish a data model of its repository, which will use all the constructs that you can find in any other data model. So there is nothing special, unique, or different about metadata in this regard.

Superficially, the statement "metadata is data" seems redundant. However, many people use the term "metadata" in a manner that implies a substantial, obvious difference from data. Thus, if there is no inherent difference between data and metadata, you'll have to look more carefully to distinguish between the two.

If metadata is data about data, then regular data must represent facts about something, anything, that isn't data. But is that really the case? In our modern economy, many activities are premised on the management of information. For instance, certain companies maintain databases containing individuals' credit histories. These companies collect this information by various means and sell it to organizations that need to make credit decisions about these individuals, or otherwise check their creditworthiness. A great deal of the information in the companies' credit reporting databases describes what they do with these credit histories, such as the prices for which this data is sold. Surely this information must be considered metadata because it fits our definition.
Those of us who are information resource management (IRM; also known as data administration) professionals tend to think that any data that directly supports the business model of an enterprise is "data," and metadata is strictly data about this data. This concept is difficult to apply in companies that are in the business of selling information as a product, as in the previous example. To consider data about an information product as metadata seems rather unhelpful. After all, what is different about enterprises that make a living by selling information? The people who work at these companies don't think they are on one side of a divide between data and metadata. Their IT staff probably don't either, so they wouldn't create a radically different systems architecture just because the company's sales, marketing, and administrative systems are in large part working with what you could technically consider metadata.

METADATA IN IRM

One way of getting out of an ever-expanding circle of metadata is to classify it into different groups. What I have just been discussing can be called "business metadata": it happens to be metadata, but it's primarily of interest to business users. What is more interesting to IRM staff is "technical metadata": data about the information resources of an enterprise that helps in the management of these resources. To IRM staff, that is "true" metadata. But is that really the case?

IRM is administrative in nature, and has the look and feel of a central support service, like Human Resources. Yet HR staff don't call the data they work with "human data," or something similar, to distinguish it from other data. Why do IRM staff feel compelled to distinguish the data (metadata) they use to manage the enterprise's information infrastructure from the actual data contained in that infrastructure? The distinction may be useful to IRM staff, but does it mean that metadata is really different?
Consider an example of an IRM function that involves a metadata management need. Suppose you work in an IRM unit that has been given the task of defining what is meant by "Customer" in your enterprise. You build a small repository (a database that houses metadata) to help you with this project. This repository tracks things like details of the databases that contain customer information across the enterprise, the business units that own these databases, semantic mapping among these databases, issues raised regarding the definition of Customer, and so on. Such a repository will be qualitatively indistinguishable from other databases in the enterprise. Therefore, you must conclude that even metadata used by the IRM function is structurally identical to any other kind of data managed by the enterprise.

The reality is that the term "metadata" is often just shorthand for the data needed by IRM to do its job. In fact, metadata may actually be more frequently used to refer to the business processes that are the responsibility of IRM, and not directly to any kind of data.

FRONTIERS OF IRM METADATA

If you confine yourself to viewing metadata as the data that the IRM function requires to do its job, can you at least place a boundary around it? That seems to be the proposition underlying many projects to build repositories for use by IRM. It would certainly be very helpful if a finite and knowable set of metadata were required by the IRM function in every enterprise.
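
To make the earlier "Customer" repository example concrete, here is a minimal sketch of what such a repository might look like, using Python's built-in sqlite3 module. Every table, column, and function name below is hypothetical and chosen only for illustration; the article does not prescribe a schema. The point of the sketch is that the repository is built and queried with exactly the same constructs (entities, attributes, relationships) as any other database.

```python
import sqlite3


def build_repository(conn):
    """Create a hypothetical IRM repository for the 'Customer' project."""
    conn.executescript(
        """
        -- Business units that own databases (illustrative entity).
        CREATE TABLE business_unit (
            unit_id INTEGER PRIMARY KEY,
            name    TEXT NOT NULL
        );
        -- Databases that contain customer information.
        CREATE TABLE source_database (
            db_id         INTEGER PRIMARY KEY,
            name          TEXT NOT NULL,
            owner_unit_id INTEGER NOT NULL REFERENCES business_unit(unit_id)
        );
        -- Semantic mappings between columns of two databases.
        CREATE TABLE semantic_mapping (
            mapping_id  INTEGER PRIMARY KEY,
            from_db_id  INTEGER NOT NULL REFERENCES source_database(db_id),
            from_column TEXT NOT NULL,
            to_db_id    INTEGER NOT NULL REFERENCES source_database(db_id),
            to_column   TEXT NOT NULL
        );
        -- Issues raised about the definition of a term such as Customer.
        CREATE TABLE definition_issue (
            issue_id    INTEGER PRIMARY KEY,
            term        TEXT NOT NULL,
            description TEXT NOT NULL
        );
        """
    )


def databases_by_owner(conn):
    """Return (business unit, database) pairs, an ordinary relational join."""
    return conn.execute(
        """
        SELECT b.name, d.name
        FROM source_database AS d
        JOIN business_unit AS b ON b.unit_id = d.owner_unit_id
        ORDER BY b.name, d.name
        """
    ).fetchall()


# Sample rows are invented for the sketch, not taken from the article.
conn = sqlite3.connect(":memory:")
build_repository(conn)
conn.executemany(
    "INSERT INTO business_unit (unit_id, name) VALUES (?, ?)",
    [(1, "Sales"), (2, "Marketing")],
)
conn.executemany(
    "INSERT INTO source_database (db_id, name, owner_unit_id) VALUES (?, ?, ?)",
    [(10, "crm", 1), (11, "campaigns", 2)],
)
conn.execute(
    "INSERT INTO semantic_mapping "
    "(from_db_id, from_column, to_db_id, to_column) VALUES (?, ?, ?, ?)",
    (10, "client_id", 11, "customer_no"),
)
conn.execute(
    "INSERT INTO definition_issue (term, description) VALUES (?, ?)",
    ("Customer", "Does a prospect count as a Customer?"),
)

for unit, database in databases_by_owner(conn):
    print(unit, database)
```

Nothing in this schema marks it as "metadata" rather than "data": it could just as easily be an inventory or HR database, which is the structural point the example makes.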











