
Intelligent Enterprise

Better Insight for Business Decisions





May 28, 2002

Managing Spaghetti Content

Every content management application demands a well-ordered taxonomy. The challenge is to maintain taxonomy quality as content evolves over time.

By Philip Russom

Continued from Page 1

Maintaining Taxonomy Quality

Here are a few tips — learned in the trenches — for assuring taxonomy quality over time.

Hire librarians. Content managers have argued for years about whether humans or software should design taxonomies and handle the tedious task of tagging source documents. "Portal in a box" software products are great for categorizing massive bodies of content, but not so great at tagging documents accurately. If accuracy is what your organization needs from a taxonomy and its content, then hire librarians: specialists who design a meaningful taxonomy, tag content as it appears, and assure quality over time.

Yahoo employs about 200 librarians who assure accuracy and quality for quite possibly the largest, richest taxonomy on Earth. That's a big payroll investment — one that few other companies can rationalize or afford. Yet, an organization whose processes depend on a meaningful, accurate taxonomy of content must budget for hiring librarians — a potential problem, because many content management applications are departmental in scope. Departmental budgets can't always support dedicated librarians.

Establish policies. Develop, document, and enforce policies for altering the taxonomy's structure and inserting documents, reports, and so on in its directory. Policies are especially important when end users contribute or tag content, but librarians need policies, too. The policy should define who does which tasks, procedures for performing tasks, and feedback mechanisms for suggesting changes and improvements.

Automate with software. Sometimes the volume of content is too great for librarians to study and tag manually (for example, when an organization needs to retrofit a content management application over a large, preexisting body of documents or aggregates content collected daily from numerous external sources).

Software (such as a portal, text-mining tool, or categorization filter) can automate document tagging and topic discovery. For the highest quality, however, librarians should still be involved. An emerging best practice for content management requires that librarians create topics and the rules that define them, whereas software applies those rules to tag incoming content. The mix of humans and machines assures the quality of content management while scaling up to massive volumes.
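The division of labor described above, with librarians writing the rules and software applying them, can be illustrated with a minimal sketch. It assumes the simplest possible kind of rule, a set of keywords per topic. The names TopicRule, tag_document, and RULES, along with the sample topics and keywords, are hypothetical and do not come from any particular product, whose rules would likely be far richer.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class TopicRule:
    """A librarian-authored rule: a topic plus the keywords that signal it."""
    topic: str
    keywords: set[str]
    min_hits: int = 1  # number of distinct keywords that must appear


def tag_document(text: str, rules: list[TopicRule]) -> list[str]:
    """Return the topics whose rules match the document text."""
    # Lowercase each word and strip surrounding punctuation.
    words = {w.strip(".,;:!?()\"'").lower() for w in text.split()}
    return [r.topic for r in rules if len(r.keywords & words) >= r.min_hits]


# Hypothetical rules a librarian might maintain (illustrative only).
RULES = [
    TopicRule("Data Warehousing", {"warehouse", "etl", "olap"}, min_hits=2),
    TopicRule("Content Management", {"taxonomy", "portal", "tagging"}),
]

doc = "The portal team rebuilt the taxonomy before loading the warehouse."
print(tag_document(doc, RULES))  # ['Content Management']
```

Because the rules live in data rather than in the tagging function, a librarian can tighten or loosen a topic (here, by changing the keywords or min_hits) without touching the software that processes incoming documents.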

Revise periodically. Even with librarians, policies, and software automation, a certain amount of degradation is inevitable. Assuring quality over time requires periodic review of the taxonomy and the content it represents (see Figure 1).

Know your content. Some taxonomies are doomed from the beginning because people didn't first study the content being managed before designing them. When technical personnel design a taxonomy, they should spend time with end users who know the content they're consuming. Also, text mining and categorization software tools can help you discover clusters of topics before you model the content with a taxonomy.




Remove old content. Many end users don't need or want to encounter documents or topics that are "old" by some standard. For instance, a staff writer focused on news reporting probably doesn't want to see press releases that are more than a year old. But a columnist may need to research older sources to inject an editorial with a sense of history. When a content management application is presented through a corporate portal, the portal's personalization capabilities can hide or reveal content based on its age. It's especially important to hide old content, so that it doesn't lower the quality of searches and queries by littering results with irrelevant and distracting source documents. When possible, archive old content as part of your quarterly review process.
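Age-based hiding of this kind can be sketched in a few lines. The example below is an illustration under stated assumptions rather than any portal's actual personalization API: the Document class, the role names, and the MAX_AGE_DAYS limits are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Document:
    title: str
    published: date


# Hypothetical per-role age limits in days; None means "show everything".
MAX_AGE_DAYS = {"news_writer": 365, "columnist": None}


def visible_documents(docs, role, today):
    """Return the documents a role should see, hiding those past its age limit."""
    # Roles without an entry are treated like None and see everything.
    limit = MAX_AGE_DAYS.get(role)
    if limit is None:
        return list(docs)
    cutoff = today - timedelta(days=limit)
    return [d for d in docs if d.published >= cutoff]


docs = [
    Document("Old press release", date(2000, 11, 3)),
    Document("Recent press release", date(2002, 4, 15)),
]
today = date(2002, 5, 28)

print([d.title for d in visible_documents(docs, "news_writer", today)])
# ['Recent press release']
print([d.title for d in visible_documents(docs, "columnist", today)])
# ['Old press release', 'Recent press release']
```

Note that the filter only hides old documents from view. Archiving them, as suggested above, remains a separate step.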

Never Done

The bad news is that the job's never done, because you must diligently maintain the quality of a taxonomy as it (and the content it represents) boils like a pot of spaghetti. The good news, however, is that — if you keep the taxonomy current, meaningful, and accurate over time — end users will keep coming back for more, making your content management application a success.


Philip Russom, Ph.D. [www.PhilipRussom.com] is a Giga Research Director at Forrester Research Inc., where he provides advice to user organizations about business intelligence, data warehousing, and data integration.


RESOURCES

Related Article at IntelligentEnterprise.com:

"An Eye for the Needle," Jan. 14, 2002 www.intelligententerprise.com/020114/502feat_1.jhtml









