Nelson King

Managing Content

The CMS is a database system and XML parser that takes the information from an XML or SGML document and stores it neatly in a database (and later recreates documents from the data).

The Web is probably the only institution where committees turn problems into opportunities. Specifically, I mean standards bodies such as the World Wide Web Consortium (W3C) and its new Extensible Markup Language (XML) specification. A problem (the lack of support for structured data in HTML) became the reason for XML. Developers are heralding XML as the next big Web revolution, and dozens (if not hundreds) of companies are jumping to find a way to exploit the change. One of the first of these companies is Poet Software Corp. with its new Content Management Suite (CMS).

If you're wondering what content management is, you'd be starting at the same place I did when I began this review. I was familiar with the term document management, which sounded similar, but the two are obviously different. The difference is in the tags. If you run a business in which storing documents is important (for example, insurance or real estate), you may already be familiar with one of the many document storage and retrieval products that use images of documents stored in files. There may be some metadata about the image, such as its date and location, but this information is limited.

By contrast, you can use a markup language in content management to insert tags into a document (also stored in a file) that can convey a great deal of information. Originally, such documents used Standard Generalized Markup Language (SGML), which posed some problems for the Web, primarily because SGML is too complex for efficient network transmission.
That's why the W3C came up with XML as a subset of SGML; it's streamlined and simpler, yet it can do about 80 percent of what SGML can do. XML can identify such structural elements of a document as title, chapter, section, and paragraph. It can also identify field names (for example, <AUTHOR>W. Shakespeare</AUTHOR>) and provide other information about the document.

The question is, how does the program that reads the document know what the tags mean? The answer is that the document may contain the definition for all its tags, the Document Type Definition (DTD). Using the DTD, an XML parser (the program that reads the document) can translate all the tags into meaningful information. XML documents can become highly structured and also provide key information, such as selected topic words or key identifiers, which are useful for database indexing.

This is where Poet CMS comes into play. The CMS is a database system and XML parser that takes the information from an XML or SGML document and stores it neatly in a database (and later recreates documents from the data). I was somewhat surprised to find that while the Poet literature and Web site emphasize XML, the product and manual are quite SGML-oriented. I'm surmising that Poet wants to take advantage of the already-established SGML market to provide a sales bridge for the day when XML becomes more widely used. This business approach is easily understandable, but there are some side effects. SGML appeals to technically sophisticated users, and the tone of the CMS product reflects this SGML approach; average users will not feel comfortable using this product.

Setup

The Content Management Suite is composed of three pieces: the Content Client, Poet Object Server, and Software Development Kit (SDK). I tested a production version of the client and Object Server, along with a late beta of the SDK.
You can install the package on one machine for development purposes, but in production you would normally distribute it among Windows NT workstations and servers on a network. The installation is fully automated (a typical InstallShield opus) and doesn't present much difficulty unless you're using something other than Microsoft's Web server, Internet Information Server. The suite supports other Web servers but requires some customization for setup.

The key component is a complete copy of Poet 5.0 Object Server, including its administrator and development modules. (See Figure 1.) This is where Poet gets a healthy jump on the competition by virtue of a true object-oriented database management system (ODBMS). I won't go into the pros and cons of object databases vs. relational databases, but let's say that if you're going to store a lot of text documents (as objects) along with their content, including graphics as objects, then an ODBMS is well-suited for the job. Poet's Object Server is highly regarded and has been around long enough to demonstrate its reliability. It's an excellent base for a content management program.

Data Import

Content Client turns the Object Server into the engine for content management. (See Figure 2.) It provides the user interface for the system and is the home of the XML parser, which does most of the work. You typically create the documents stored in CMS in an SGML or XML editor and then import them. The process is relatively simple: you must first select (or construct) a DTD if the document doesn't have one, then define a publication specification to tell Poet about the DTD and other location information. Poet provides a number of DTDs for standard document formats. Unlike many other kinds of databases, you don't need to go through the definition of fields and field attributes to create a Poet database; you can create it directly from the DTD.
Although the Content Client uses the familiar Windows Explorer-like hierarchy of objects to represent documents and their components, it also employs SGML jargon and assumes that users will understand everything about DTDs, object components, Object Queries, and document structure. You can navigate a document and show its contents with the other Content Client windows. The Content Client is quite simple to use, once you become familiar with the general approach.

As you import the document, you can parse its contents (using the tags) and store the various components as individual objects in the database. When you create the publication specification, you can select from the DTD those tags that you want to use as components. In this way, the components of a large document (for example, a book) can be stored in as granular a breakdown as necessary (for example, book, chapter, section, paragraph).

When it comes time to edit these components, Poet CMS shows off the value of its database engine. Several people can edit the document simultaneously, although they can't check out the same component. Poet CMS also provides data integrity protection, concurrency locking, and transaction management (including the ability to roll back an edit).

Searching

I should emphasize that systems like CMS are not intended for collecting numerous short documents, such as letters or flyers. CMS earns its keep with very large or complex documents, such as a parts catalog or engineering documentation, where organizing and finding the information is beyond the ability of human memory. This is particularly true for locating information in the database, where CMS employs the Object Query Language (OQL), a version of SQL designed for object databases. If you're familiar with SQL, this form of searching will seem easy, but the CMS approach is sort of bare-knuckled; most users will require training. You can also search for keys that are imported with the XML or SGML tags, which can be the quickest method.
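To make the tag-driven breakdown and key lookup described above more concrete, here is a minimal sketch in Python using only the standard library's xml.etree.ElementTree. It is not Poet's API and does not use OQL; the sample document, the tag names, the id attributes, and the helper functions are all hypothetical, and a plain dictionary stands in for the object database.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; tag names and ids are illustrative only.
SAMPLE = """<book id="book1">
  <title>Collected Plays</title>
  <author>W. Shakespeare</author>
  <chapter id="ch1">
    <title>Hamlet</title>
    <section id="ch1-s1"><para>Who's there?</para></section>
  </chapter>
  <chapter id="ch2">
    <title>Macbeth</title>
    <section id="ch2-s1"><para>When shall we three meet again?</para></section>
  </chapter>
</book>"""

# Tags chosen as components; this set stands in for the choices a user
# would make in a publication specification (an assumption, not Poet's format).
COMPONENT_TAGS = {"book", "chapter", "section"}


def split_into_components(xml_text, component_tags):
    """Return {component id: (tag, serialized XML)} for each chosen tag."""
    root = ET.fromstring(xml_text)
    store = {}
    for element in root.iter():
        if element.tag in component_tags:
            serialized = ET.tostring(element, encoding="unicode")
            store[element.get("id")] = (element.tag, serialized)
    return store


def find_by_tag(store, tag):
    """Tag-based lookup: sorted ids of all stored components with this tag."""
    return sorted(key for key, (kind, _) in store.items() if kind == tag)


components = split_into_components(SAMPLE, COMPONENT_TAGS)
chapter_ids = find_by_tag(components, "chapter")
```

Each stored component keeps its own markup, so a chapter or section can be retrieved and edited without touching the rest of the book. Adding "para" to the component set would give a finer-grained breakdown.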
While the CMS can perform searches and queries, these tasks shouldn't be confused with such text indexing and searching programs as Verity Corp.'s Search98 or Oracle's ConText. CMS searches are based on tags; the others search and index all text within a document (which would be far too much manual work to do with tags). Consequently, Poet CMS includes a license for the Verity text search engine.

Checking Out

CMS keeps a record of every check-out and check-in of document components, and if anything has been changed, it will keep a copy of both the old and new versions. Later, you can edit or even recombine versions into new documents, which is part of what is called repurposing content. You can repurpose content with Poet CMS, although not as easily as in some similar products such as Texcel International's Texcel Information Manager. However, I liked the way that CMS creates editions that specifically mark and label a particular version of a document.

To view or edit documents stored in CMS, you export a selected component as a file in either SGML or XML format, so you'll need a specialized editor or viewer to work with the files. In this respect, I think CMS and several similar programs leave users far short of the workflow and collaborative editing products that are currently available (for example, Lotus Notes); nor does CMS support any kind of distribution or replication of documents. Of course, most workflow programs don't yet support XML or SGML.

The Content SDK

For those who wish to customize CMS databases and connect with other applications, the suite includes a Software Development Kit. The Poet Content SDK includes API libraries (classes and components) for C++ and Java along with ActiveX controls. In the beta version I tested, the documentation was sometimes sparse or inaccurate, but not difficult to follow (assuming you're an experienced programmer).
As with most SDKs, it will take a while to assimilate the different components and work them into your development environment. The SDK installs a copy of the CMS Developer, which is similar to the Poet Developer for access to Poet databases but has been tailored for the Content Management Suite and is more useful in that context.

You also get a copy of the new Web Factory, a product that provides a connection between CMS data and a Web server (Microsoft IIS 4.0 by default). When everything is installed correctly, you will need a Poet CMS URL such as: localhost/ webfact.dll?object=poet://LOCAL//c:\ data\%3Foid=\\mybook\book1. At the heart of the Web Factory, Poet has used XML to fashion its own markup language, Poet XML (PXML), in order to create dynamic Web pages. This approach is similar to those taken in programs such as Allaire Cold Fusion that use HTML tag extensions to perform database and other processing activity.

The programming APIs combined with the data management available from the Poet Object Server make Poet CMS a well-integrated and powerful development package. The ability to code business rules into the database and then attach them to multiple XML datasets with different DTDs is one good example of its strength. As a whole, the Poet Content Management SDK should be considered the strong point of CMS.

Specialist Mode

To a certain extent, Poet CMS is at the mercy of the adoption rate for XML and SGML. CMS doesn't address the creation of documents, leaving the editing to specialized programs such as ArborText ADEPT. This, too, tends to lock CMS into a rather specialist mode. One of its competitors, Magnus Group Target 2000, is open to any kind of markup language: XML, SGML, HTML, or something new. Given the fluid state of the Internet, this flexibility is probably a good idea. Because these markup languages are difficult to adopt, XML will certainly provide enough business for products such as Poet CMS.
The Content Management Suite is a powerful collection of tools at a relatively low level of abstraction. I suspect that products that attempt to make XML much more visual and friendly for the average user will soon challenge Poet's rather technical approach. However, with its complete SDK, Poet may hold the high ground as the best fundamental content management system available, as long as you're willing to match up its capabilities with custom applications.
Copyright © 2004 CMP Media Inc. All rights reserved. No reproduction without permission.