Freedom of Information
Business metadata, though difficult to provide efficiently, releases more of the potential of tabular data
By Shiraz Kassam
Making decisions based entirely on operational data is problematic. Besides the usual problems with using data for decisions, such as limited availability, quality, and timeliness, internal data is inherently unreliable as a sole source for decision-making: it lacks context. Considerable evidence indicates that pure data is very poor at supporting decisions, and that context is absolutely necessary to understanding the complete business environment within which the facts should be interpreted. Such an understanding is essential to good decision-making. Providing correct data and the context for it (through metadata) to the right decision maker at the right time, therefore, is an important challenge to overcome.
Open Your Eyes
First, let's consider what metadata is. My favorite general definition of metadata comes from Robert S. Seiner of The Data Administration Newsletter (TDAN.com):
Information that improves both the business and technical understanding of data and data-related processes; metadata enables end users to understand the data and make better decisions based on this understanding.
Traditionally, we had to make do with the brief definition of metadata as simply "data about data." Seiner broadens the scope of the definition by drawing our awareness to the opportunities in metadata collection and use. He also draws awareness to the "customer" of the metadata: a user who resides in the business realm and, therefore, can make a case for, and measure the value of, the metadata.
For our purposes, we want to apply this fairly broad metadata definition specifically to the data warehouse environment. We can do this using the conceptual model in Table 1 as a guide. Metadata in general can be divided into four categories, each with its own data artifacts (objects about which we would collect data). Each category contributes to the overall collection of metadata that will make the data easier to use and more meaningful. The first three categories (elemental-level meaning, technical definitions, and data movement definitions) are reasonably well understood, so I will not discuss them further. However, business metadata needs further explanation.
Know the Terms
Business metadata refers to the information that increases our understanding of traditional tabular (structured) data when it's formatted into reports. It includes the business terms used in the reports; the methodology used to derive many data elements in the report; report uses and underlying assumptions; and external information and opinions (nontabular data). What constitutes business metadata is subjective and depends on the data consumer. The primary purpose of this metadata should be to provide context to the tabular data, thereby enriching the information.
The context need not be the same for all users. For example, the sales director looking at last month's sales report needs a breakdown by products, and giving context to this data could mean providing reports comparing data on a regional basis, competitor activities including marketing efforts, weather, or information on other social and political events that could have a bearing on the sales data (such as the Sept. 11th terrorist attacks). The same data when viewed by the production manager would require different contextual data: inventory in house, inventory in production, supply-line issues, state of the production facility, sales forecast, and cost and supply of raw material, either as numbers or as comments.
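The idea that one report carries different context for different consumers can be sketched as a small data structure. This is only an illustration under stated assumptions: the `Report` class, its `context_by_role` field, and the `context_for` method are hypothetical names invented here, not part of any metadata product, and the context items are simply the examples given in the text.

```python
from dataclasses import dataclass, field


@dataclass
class Report:
    """A tabular report plus role-specific business metadata (hypothetical model)."""
    name: str
    # Maps a consumer role to the context items that enrich the report for that role.
    context_by_role: dict[str, list[str]] = field(default_factory=dict)

    def context_for(self, role: str) -> list[str]:
        # A role with no registered context gets an empty list rather than an error.
        return self.context_by_role.get(role, [])


# The same sales data, with the two sets of context described in the text.
sales_report = Report(
    name="Last month's sales by product",
    context_by_role={
        "sales director": [
            "regional comparison reports",
            "competitor activities, including marketing efforts",
            "weather",
            "social and political events",
        ],
        "production manager": [
            "inventory in house",
            "inventory in production",
            "supply-line issues",
            "state of the production facility",
            "sales forecast",
            "cost and supply of raw material",
        ],
    },
)

for item in sales_report.context_for("production manager"):
    print(item)
```

Keying the context by role, rather than attaching one fixed set to the report, reflects the point that what counts as business metadata depends on the data consumer.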
The role business metadata can play is best understood by considering how decision-making currently operates with a data warehouse. The following discussion illustrates how the data warehouse is generally used in the corporation. Figure 1 depicts the traditional decision-making process using the information from a data warehouse. The diagram shows several tiers within the organization where various activities take place, all supporting the overall coordination of decision-making.
At the top tier are the organization goals, stated in a tangible format against which the progress of the organization can be gauged. The next tier is the Gap Analysis and Action layer, where we look at "where we are" vs. "where we should be." This is the reality check tier. If the gap between reality and expectations is getting wider, or new goals are identified, then management launches new business strategies and initiatives at this tier.
The effect of these initiatives is reflected in the production systems at the lowest tier, the Operational System layer. This is the first place where the effects of current business practices are recorded, such as current sales or new orders being placed with suppliers. The effects of business strategy changes are also first observed here, such as increased sales resulting from discounts in the third quarter.
The operational-layer data is then rolled up to the Data Warehouse layer, where various tools are used to interpret the data and standard and ad hoc reports are generated. The next tier is the Business Intelligence layer, where raw numbers produced by various tools, such as online analytic processing tools, are studied and evaluated. This layer is about understanding and knowing; it's where the data is generally evaluated and understood, but no major decisions are made. Decision-making happens at the next layer up, Gap Analysis and Action.
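The "where we are" vs. "where we should be" comparison at the Gap Analysis and Action tier can be illustrated with a few lines of code. This is a minimal sketch, assuming a single numeric goal and one actual value per reporting period; the function names `gaps` and `gap_is_widening` and the sample figures are hypothetical and chosen only for illustration.

```python
def gaps(goal: float, actuals: list[float]) -> list[float]:
    """Gap between the goal and each period's actual value (positive = short of goal)."""
    return [goal - actual for actual in actuals]


def gap_is_widening(goal: float, actuals: list[float]) -> bool:
    """True when the most recent gap is larger than the one before it."""
    period_gaps = gaps(goal, actuals)
    if len(period_gaps) < 2:
        # With fewer than two periods there is no trend to compare.
        return False
    return period_gaps[-1] > period_gaps[-2]


# Illustrative figures only: a goal of 100 units and two periods of actual sales.
print(gap_is_widening(100.0, [90.0, 85.0]))  # gap grew from 10 to 15
print(gap_is_widening(100.0, [80.0, 95.0]))  # gap shrank from 20 to 5
```

A check like this only flags that the gap is widening. Deciding what to do about it is where the context supplied by business metadata comes in.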
This cycle is more or less generic to most organizations that have some kind of data warehouse in place. Some have wide-ranging data that may cover the whole range of business activities. Most have at least one business-functional area covered, such as sales and marketing. Notice that Figure 1 shows no formal process whereby nontabular data gathered from outside sources and in-house expertise is passed to the consumer of the data warehouse's data. This is a major gap in the flow of information that commonly exists in BI systems. Another common problem illustrated here is the lack of formal integration of the metadata repository into the information loop.
Dig Up the Details
To identify the data you need to collect and the metadata that will put it into the right context, you first need to identify the decision makers in your organization and then study them. The decision makers are middle- to upper-level managers who make tactical and strategic decisions. People with titles such as product manager, marketing manager, and divisional CFO are likely to be your customers. Next, you'll need to identify the kinds of decisions these individuals or groups make. Following their actions will help you determine the right information bundle (data and metadata) to provide to the right person at the right time.











